Jackknife Resampling

In statistics, the jackknife (jackknife cross-validation) is a cross-validation technique and, therefore, a form of resampling.

It is especially useful for bias and variance estimation. The jackknife pre-dates other common resampling methods such as the bootstrap. Given a sample of size $n$, a jackknife estimator can be built by aggregating the parameter estimates from each subsample of size $n-1$ obtained by omitting one observation.

[Figure: Schematic of jackknife resampling]

The jackknife technique was developed by Maurice Quenouille (1924–1973) beginning in 1949 and refined in 1956. John Tukey expanded on the technique in 1958 and proposed the name "jackknife" because, like a physical jack-knife (a compact folding knife), it is a rough-and-ready tool that can improvise a solution for a variety of problems, even though specific problems may be more efficiently solved with a purpose-designed tool.

The jackknife is a linear approximation of the bootstrap.

A simple example: mean estimation

The jackknife estimator of a parameter is found by systematically leaving out each observation from a dataset and calculating the parameter estimate over the remaining observations and then aggregating these calculations.

For example, if the parameter to be estimated is the population mean of a random variable $x$, then for a given set of i.i.d. observations $x_1, \ldots, x_n$ the natural estimator is the sample mean:

    \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} \sum_{i \in [n]} x_i,

where the last sum uses an alternative notation to indicate that the index $i$ runs over the set $[n] = \{1, \ldots, n\}$.

Then we proceed as follows: for each $i \in [n]$ we compute the mean $\bar{x}_{(i)}$ of the jackknife subsample consisting of all but the $i$-th data point, and this is called the $i$-th jackknife replicate:

    \bar{x}_{(i)} = \frac{1}{n-1} \sum_{j \in [n],\, j \neq i} x_j, \qquad i = 1, \ldots, n.

It may help to think of these $n$ jackknife replicates $\bar{x}_{(1)}, \ldots, \bar{x}_{(n)}$ as giving an approximation of the distribution of the sample mean $\bar{x}$; the larger $n$ is, the better this approximation will be. Finally, to get the jackknife estimator we take the average of these $n$ jackknife replicates:

    \bar{x}_{\mathrm{jack}} = \frac{1}{n} \sum_{i=1}^{n} \bar{x}_{(i)}.
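
The construction above can be sketched in a few lines of Python. This is a minimal illustration assuming NumPy; the function name `jackknife_mean_replicates` and the five data values are made up for the example and do not come from the source.

```python
import numpy as np

def jackknife_mean_replicates(x):
    """Return the n leave-one-out means of a 1-D sample (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Subtracting x[i] from the total and dividing by n - 1
    # gives the mean of the subsample that omits the i-th point.
    return (x.sum() - x) / (n - 1)

x = np.array([2.0, 4.0, 4.0, 5.0, 10.0])  # hypothetical data
replicates = jackknife_mean_replicates(x)

# The jackknife estimator is the average of the replicates.
x_jack = replicates.mean()

print("replicates:", replicates)
print("jackknife estimate:", x_jack, "sample mean:", x.mean())
```

For the mean, the two printed estimates should coincide, which is the identity discussed next in the text.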

One may ask about the bias and the variance of $\bar{x}_{\mathrm{jack}}$. From the definition of $\bar{x}_{\mathrm{jack}}$ as the average of the jackknife replicates one could try to calculate them explicitly. The bias is a trivial calculation, but the variance of $\bar{x}_{\mathrm{jack}}$ is more involved since the jackknife replicates are not independent.

For the special case of the mean, one can show explicitly that the jackknife estimate equals the usual estimate:

    \frac{1}{n} \sum_{i=1}^{n} \bar{x}_{(i)} = \frac{1}{n(n-1)} \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} x_j - x_i \Big) = \frac{1}{n(n-1)} (n-1) \sum_{j=1}^{n} x_j = \bar{x}.

This establishes the identity $\bar{x}_{\mathrm{jack}} = \bar{x}$. Then taking expectations we get $E[\bar{x}_{\mathrm{jack}}] = E[\bar{x}] = E[x]$, so $\bar{x}_{\mathrm{jack}}$ is unbiased, while taking variance we get $V[\bar{x}_{\mathrm{jack}}] = V[\bar{x}] = V[x]/n$. However, these properties do not generally hold for parameters other than the mean.

This simple example for the case of mean estimation is just to illustrate the construction of a jackknife estimator, while the real subtleties (and the usefulness) emerge for the case of estimating other parameters, such as higher moments than the mean or other functionals of the distribution.

The jackknife estimator $\bar{x}_{\mathrm{jack}}$ could be used to construct an empirical estimate of the bias of $\bar{x}$, namely $\widehat{\operatorname{bias}}(\bar{x})_{\mathrm{jack}} = c(\bar{x}_{\mathrm{jack}} - \bar{x})$ with some suitable factor $c > 0$. In this case we know that $\bar{x}_{\mathrm{jack}} = \bar{x}$, so this construction does not add any meaningful knowledge, but it does give the correct estimate of the bias (which is zero).

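A jackknife estimate of the variance of $\bar{x}$ can be calculated from the variance of the jackknife replicates $\bar{x}_{(i)}$:

    \widehat{\operatorname{var}}(\bar{x})_{\mathrm{jack}} = \frac{n-1}{n} \sum_{i=1}^{n} (\bar{x}_{(i)} - \bar{x}_{\mathrm{jack}})^2 = \frac{1}{n(n-1)} \sum_{i=1}^{n} (x_i - \bar{x})^2.

The left equality defines the estimator $\widehat{\operatorname{var}}(\bar{x})_{\mathrm{jack}}$ and the right equality is an identity that can be verified directly. Then taking expectations we get $E[\widehat{\operatorname{var}}(\bar{x})_{\mathrm{jack}}] = V[x]/n = V[\bar{x}]$, so this is an unbiased estimator of the variance of $\bar{x}$.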
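
The identity between the jackknife variance estimate and the classical estimate (the unbiased sample variance divided by the sample size) can be checked numerically. The following is a minimal sketch assuming NumPy; the function name `jackknife_variance_of_mean` and the data are illustrative only.

```python
import numpy as np

def jackknife_variance_of_mean(x):
    """Jackknife variance estimate of the sample mean (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    replicates = (x.sum() - x) / (n - 1)   # leave-one-out means
    x_jack = replicates.mean()             # average of the replicates
    # Scaled sum of squared deviations of the replicates around their average.
    return (n - 1) / n * np.sum((replicates - x_jack) ** 2)

x = np.array([2.0, 4.0, 4.0, 5.0, 10.0])   # hypothetical data
var_jack = jackknife_variance_of_mean(x)

# Classical estimate: unbiased sample variance divided by n.
var_classic = x.var(ddof=1) / x.size

print(var_jack, var_classic)
```

Both values should agree up to floating-point rounding, for any sample with at least two points.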

Estimating the bias of an estimator

The jackknife technique can be used to estimate (and correct) the bias of an estimator calculated over the entire sample.

Suppose $\theta$ is the target parameter of interest, which is assumed to be some functional of the distribution of $x$. Based on a finite set of observations $x_1, \ldots, x_n$, which is assumed to consist of i.i.d. copies of $x$, the estimator $\hat{\theta}$ is constructed:

    \hat{\theta} = f_n(x_1, \ldots, x_n).

The value of $\hat{\theta}$ is sample-dependent, so this value will change from one random sample to another.

By definition, the bias of $\hat{\theta}$ is as follows:

    \operatorname{bias}(\hat{\theta}) = E[\hat{\theta}] - \theta.

One may wish to compute several values of $\hat{\theta}$ from several samples, and average them, to calculate an empirical approximation of $E[\hat{\theta}]$, but this is impossible when there are no "other samples", that is, when the entire set of available observations $x_1, \ldots, x_n$ was used to calculate $\hat{\theta}$. In this kind of situation the jackknife resampling technique may be of help.
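
To make the definition of bias concrete, the "several samples" approach can be simulated in a setting where fresh samples are available. The sketch below assumes NumPy, a standard normal population (true variance 1), and the plug-in variance (divisor equal to the sample size) as the estimator; none of these choices come from the source. With real data only one sample exists, which is exactly the situation the jackknife addresses.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
n = 5                            # size of each sample
n_samples = 100_000              # number of independent samples (simulation only)
true_variance = 1.0              # variance of the standard normal population

# Each row is one sample of size n.
samples = rng.standard_normal((n_samples, n))

# Plug-in variance of each sample (ddof=0 divides by n, which is biased).
theta_hats = samples.var(axis=1)

# Empirical approximation of E[theta_hat] - theta.
empirical_bias = theta_hats.mean() - true_variance

# Known bias of the plug-in variance: -sigma^2 / n.
theoretical_bias = -true_variance / n

print(empirical_bias, theoretical_bias)
```

The empirical value should be close to the theoretical one, with the gap shrinking as `n_samples` grows.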

We construct the jackknife replicates:

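    \hat{\theta}_{(1)} = f_{n-1}(x_2, x_3, \ldots, x_n)
    \hat{\theta}_{(2)} = f_{n-1}(x_1, x_3, \ldots, x_n)
    \vdots
    \hat{\theta}_{(n)} = f_{n-1}(x_1, x_2, \ldots, x_{n-1})

where each replicate is a "leave-one-out" estimate based on the jackknife subsample consisting of all but one of the data points:

    \hat{\theta}_{(i)} = f_{n-1}(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n), \qquad i = 1, \ldots, n.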

Then we define their average:

    \hat{\theta}_{\mathrm{jack}} = \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_{(i)}.

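The jackknife estimate of the bias of $\hat{\theta}$ is given by:

    \widehat{\operatorname{bias}}(\hat{\theta})_{\mathrm{jack}} = (n-1)(\hat{\theta}_{\mathrm{jack}} - \hat{\theta})

and the resulting bias-corrected jackknife estimate of $\theta$ is given by:

    \hat{\theta}_{\mathrm{jack}}^{*} = \hat{\theta} - \widehat{\operatorname{bias}}(\hat{\theta})_{\mathrm{jack}} = n\hat{\theta} - (n-1)\hat{\theta}_{\mathrm{jack}}.

This removes the bias in the special case that the bias is $O(n^{-1})$ and reduces it to $O(n^{-2})$ in other cases.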
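
The bias estimate and correction can be sketched for an arbitrary estimator as follows. This is a minimal illustration assuming NumPy; the helper names `jackknife_bias_correct` and `plug_in_variance` and the data are hypothetical. The plug-in variance is chosen as the example because its bias is exactly proportional to the reciprocal of the sample size, so the correction should recover the usual unbiased sample variance.

```python
import numpy as np

def jackknife_bias_correct(x, estimator):
    """Return (bias-corrected estimate, jackknife bias estimate) for estimator(x)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    theta_hat = estimator(x)  # estimate from the full sample
    # Leave-one-out replicates: the estimator applied to each subsample of size n - 1.
    replicates = np.array([estimator(np.delete(x, i)) for i in range(n)])
    theta_jack = replicates.mean()
    bias_jack = (n - 1) * (theta_jack - theta_hat)
    return theta_hat - bias_jack, bias_jack

def plug_in_variance(sample):
    """Biased variance estimator that divides by the sample size."""
    return np.var(sample)  # ddof=0 by default

x = np.array([2.0, 4.0, 4.0, 5.0, 10.0])  # hypothetical data
corrected, bias = jackknife_bias_correct(x, plug_in_variance)

print("plug-in:", plug_in_variance(x))
print("jackknife bias estimate:", bias)
print("corrected:", corrected, "unbiased sample variance:", np.var(x, ddof=1))
```

The loop calls the estimator once per observation, so the cost is roughly the sample size times the cost of one evaluation; for simple statistics such as the mean, closed-form replicates avoid the loop.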

Estimating the variance of an estimator

The jackknife technique can also be used to estimate the variance of an estimator calculated over the entire sample.
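
The text does not spell out the formula for a general estimator. The sketch below assumes the standard form, which applies the variance estimate described for the sample mean to the leave-one-out replicates of any estimator: the factor $(n-1)/n$ times the sum of squared deviations of the replicates from their average. It assumes NumPy; the names `jackknife_variance` and `coefficient_of_variation` and the data are illustrative, not from the source.

```python
import numpy as np

def jackknife_variance(x, estimator):
    """Jackknife variance estimate of estimator(x) for a 1-D sample x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Leave-one-out replicates of the estimator.
    replicates = np.array([estimator(np.delete(x, i)) for i in range(n)])
    theta_jack = replicates.mean()
    # Assumed standard form: (n - 1) / n times the sum of squared deviations.
    return (n - 1) / n * np.sum((replicates - theta_jack) ** 2)

def coefficient_of_variation(sample):
    """Example of a nonlinear statistic: sample standard deviation over sample mean."""
    return np.std(sample, ddof=1) / np.mean(sample)

x = np.array([2.0, 4.0, 4.0, 5.0, 10.0])  # hypothetical data

# Sanity check on the mean: should match the classical estimate s^2 / n.
var_of_mean = jackknife_variance(x, np.mean)
print(var_of_mean, x.var(ddof=1) / x.size)

# A statistic without a simple closed-form variance.
var_of_cv = jackknife_variance(x, coefficient_of_variation)
print("variance:", var_of_cv, "standard error:", np.sqrt(var_of_cv))
```

Passing the estimator as a function keeps the resampling logic separate from the statistic, so the same helper can be reused for other estimators.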

