Ancillary Statistic

An ancillary statistic is a statistic, that is, a function of the sample, whose distribution (equivalently, whose pmf or pdf) does not depend on the parameters of the model.

An ancillary statistic is a pivotal quantity that is also a statistic. Ancillary statistics can be used to construct prediction intervals. They are also used in connection with Basu's theorem to prove independence between statistics.

This concept was first introduced by Ronald Fisher in the 1920s, but its formal definition was only provided in 1964 by Debabrata Basu.

Examples

Suppose $X_1, \dots, X_n$ are independent and identically distributed, and are normally distributed with unknown expected value μ and known variance 1. Let

    $\overline{X}_n = \frac{X_1 + \cdots + X_n}{n}$

be the sample mean.

The following statistical measures of dispersion of the sample

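      Range: $\max(X_1, \dots, X_n) - \min(X_1, \dots, X_n)$
      Interquartile range: $Q_3 - Q_1$
      Sample variance: $\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \overline{X}_n)^2$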

are all ancillary statistics, because their sampling distributions do not change as μ changes. Computationally, this is because the μ terms cancel in the formulas: adding a constant to every observation shifts the sample maximum and the sample minimum by the same amount, so their difference is unchanged, and the same holds for the other measures. These measures of dispersion do not depend on location.
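The cancellation can be checked numerically. The sketch below is illustrative only: it fixes one set of standard normal draws, shifts them by several values of μ, and recomputes three dispersion measures (range, interquartile range, and variance with divisor n). The choice of measures, the helper name `dispersion_summary`, the seed, and the sample size are assumptions made for this example.

```python
import random
import statistics

def dispersion_summary(xs):
    """Return (range, interquartile range, variance with divisor n)."""
    q1, _, q3 = statistics.quantiles(xs, n=4)  # the three quartile cut points
    return (max(xs) - min(xs), q3 - q1, statistics.pvariance(xs))

# One fixed set of N(0, 1) draws; the seed is arbitrary.
rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(25)]

summaries = {}
for mu in (0.0, 5.0, -3.0):
    x = [mu + zi for zi in z]  # the same sample, shifted to mean mu
    summaries[mu] = dispersion_summary(x)
    print(mu, summaries[mu])
```

Up to floating-point rounding, the three printed tuples agree, because each measure depends on the observations only through differences between them.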

Conversely, given i.i.d. normal variables with known mean 1 and unknown variance $\sigma^2$, the sample mean $\overline{X}_n$ is not an ancillary statistic for the variance, as the sampling distribution of the sample mean is $N(1, \sigma^2/n)$, which does depend on $\sigma^2$: this measure of location (specifically, its standard error) depends on dispersion.
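A small simulation makes the dependence on σ visible. This is a sketch under stated assumptions: the function name `sample_means`, the sample size n = 10, the number of replications, and the seed are all choices made for illustration. The same underlying standard normal draws are reused for each σ, so the spread of the simulated sample means scales with σ.

```python
import random
import statistics

def sample_means(sigma, n=10, reps=2000, seed=1):
    """Simulate `reps` sample means of n i.i.d. N(1, sigma^2) observations."""
    rng = random.Random(seed)
    return [
        statistics.fmean(1.0 + sigma * rng.gauss(0.0, 1.0) for _ in range(n))
        for _ in range(reps)
    ]

# Empirical standard deviation of the sample mean for two values of sigma.
spread = {sigma: statistics.stdev(sample_means(sigma)) for sigma in (1.0, 3.0)}
for sigma, sd in spread.items():
    print(sigma, round(sd, 3), "theory:", round(sigma / 10 ** 0.5, 3))
```

The empirical spread tracks the theoretical standard error σ/√n, so the distribution of the sample mean changes with σ and the sample mean cannot be ancillary for the variance.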

In location-scale families

In a location family of distributions, $(X_1 - X_n, X_2 - X_n, \dots, X_{n-1} - X_n)$ is an ancillary statistic.

In a scale family of distributions, $\left(\frac{X_1}{X_n}, \frac{X_2}{X_n}, \dots, \frac{X_{n-1}}{X_n}\right)$ is an ancillary statistic.

In a location-scale family of distributions, $\left(\frac{X_1 - X_n}{S}, \frac{X_2 - X_n}{S}, \dots, \frac{X_{n-1} - X_n}{S}\right)$, where $S^2$ is the sample variance, is an ancillary statistic.
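As an illustration of why such statistics are ancillary, the sketch below takes one standard choice for a location-scale family, the differences from the last observation divided by the sample standard deviation S, and checks that it is unchanged when the data are shifted and rescaled. The function name `configuration`, the divisor n − 1 for the sample variance, and the particular location and scale values are assumptions made for this example.

```python
import random
import statistics

def configuration(xs):
    """Differences from the last observation, scaled by the sample standard deviation S."""
    s = statistics.stdev(xs)  # square root of the sample variance (divisor n - 1)
    last = xs[-1]
    return [(x - last) / s for x in xs[:-1]]

rng = random.Random(42)
z = [rng.gauss(0.0, 1.0) for _ in range(6)]

base = configuration(z)
# Same draws moved to location 7 and scale 2.5 (arbitrary illustrative values).
moved = configuration([7.0 + 2.5 * zi for zi in z])

for b, m in zip(base, moved):
    print(round(b, 6), round(m, 6))
```

Each pair of printed values agrees: the location cancels in the differences and the scale cancels in the ratio, so the distribution of the statistic cannot depend on either parameter.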

In recovery of information

It turns out that, if $T_1$ is a non-sufficient statistic and $T_2$ is ancillary, one can sometimes recover all the information about the unknown parameter contained in the entire data by reporting $T_1$ while conditioning on the observed value of $T_2$. This is known as conditional inference.

For example, suppose that $X_1, X_2$ are independent and follow the $N(\theta, 1)$ distribution, where $\theta$ is unknown. Note that, even though $X_1$ is not sufficient for $\theta$ (since its Fisher information is 1, whereas the Fisher information of the complete statistic $\overline{X}$ is 2), by additionally reporting the ancillary statistic $X_1 - X_2$, one obtains a joint distribution with Fisher information 2.
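The recovery of information can be worked out explicitly. The derivation below is a sketch that assumes two independent observations $X_1, X_2$ from $N(\theta, 1)$ and takes the ancillary statistic to be $A = X_1 - X_2$ (the symbol $A$ is introduced here for convenience). It uses the standard formula for conditioning in a bivariate normal distribution.

```latex
\begin{align*}
A &= X_1 - X_2 \sim N(0, 2)
  && \text{(free of } \theta \text{, hence ancillary)} \\
\operatorname{Cov}(X_1, A) &= \operatorname{Var}(X_1) = 1 \\
X_1 \mid A = a &\sim N\!\left(\theta + \tfrac{a}{2},\ 1 - \tfrac{1^2}{2}\right)
  = N\!\left(\theta + \tfrac{a}{2},\ \tfrac{1}{2}\right) \\
I_{X_1 \mid A}(\theta) &= \frac{1}{1/2} = 2 = I_{(X_1, X_2)}(\theta)
\end{align*}
```

Marginally, $X_1$ has variance 1 and carries Fisher information 1; conditionally on $A$, its variance halves and its information doubles, matching the information in the full sample.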

Ancillary complement

Given a statistic T that is not sufficient, an ancillary complement is a statistic U that is ancillary and such that (T, U) is sufficient. Intuitively, an ancillary complement "adds the missing information" (without duplicating any).

An ancillary complement is particularly useful when T is a maximum likelihood estimator, which in general will not be sufficient; one can then ask for an ancillary complement. In this case, Fisher argued that one must condition on an ancillary complement to determine information content: the Fisher information of T should be computed not from the marginal distribution of T but from the conditional distribution of T given U, which answers the question of how much information T adds. This is not possible in general, as no ancillary complement need exist, and if one exists, it need not be unique, nor need a maximum ancillary complement exist.

Example

In baseball, suppose a scout observes a batter in N at-bats. Suppose (unrealistically) that the number N is chosen by some random process that is independent of the batter's ability; say a coin is tossed after each at-bat and the result determines whether the scout will stay to watch the batter's next at-bat. The eventual data are the number N of at-bats and the number X of hits: the data (X, N) are a sufficient statistic. The observed batting average X/N fails to convey all of the information available in the data because it fails to report the number N of at-bats (e.g., a batting average of 0.400, which is very high, based on only five at-bats does not inspire anywhere near as much confidence in the player's ability as a 0.400 average based on 100 at-bats). The number N of at-bats is an ancillary statistic because

  • It is a part of the observable data (it is a statistic), and
  • Its probability distribution does not depend on the batter's ability, since it was chosen by a random process independent of the batter's ability.

This ancillary statistic is an ancillary complement to the observed batting average X/N, i.e., the batting average X/N is not a sufficient statistic, in that it conveys less than all of the relevant information in the data, but conjoined with N, it becomes sufficient.
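To see numerically why N matters, one can compare the uncertainty attached to the same batting average at different numbers of at-bats. The sketch below assumes, for illustration only, that hits given N follow a binomial model (independent at-bats with a constant hit probability) and uses the usual plug-in standard error; the function name `batting_average_se` is hypothetical.

```python
import math

def batting_average_se(p_hat, n):
    """Plug-in standard error of a proportion p_hat observed over n trials.

    Assumes a binomial model for the number of hits given n at-bats.
    """
    return math.sqrt(p_hat * (1.0 - p_hat) / n)

for n in (5, 100):
    print(n, round(batting_average_se(0.400, n), 3))
# prints:
# 5 0.219
# 100 0.049
```

The same average of 0.400 comes with a standard error more than four times larger at five at-bats than at 100, which is the information lost when X/N is reported without N.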
