Von Mises–Fisher Distribution

In directional statistics, the von Mises–Fisher distribution (named after Richard von Mises and Ronald Fisher) is a probability distribution on the $(p-1)$-sphere in $\mathbb{R}^{p}$.

If $p = 2$, the distribution reduces to the von Mises distribution on the circle.

Definition

The probability density function of the von Mises–Fisher distribution for the random p-dimensional unit vector $\mathbf{x}$ is given by:

    $$f_p(\mathbf{x}; \boldsymbol{\mu}, \kappa) = C_p(\kappa) \exp\left(\kappa \boldsymbol{\mu}^\mathsf{T} \mathbf{x}\right),$$

where $\kappa \ge 0$, $\|\boldsymbol{\mu}\| = 1$, and the normalization constant $C_p(\kappa)$ is equal to

    $$C_p(\kappa) = \frac{\kappa^{p/2-1}}{(2\pi)^{p/2} I_{p/2-1}(\kappa)},$$

where $I_v$ denotes the modified Bessel function of the first kind at order $v$. If $p = 3$, the normalization constant reduces to

    $$C_3(\kappa) = \frac{\kappa}{4\pi \sinh \kappa} = \frac{\kappa}{2\pi \left(e^{\kappa} - e^{-\kappa}\right)}.$$

The parameters $\boldsymbol{\mu}$ and $\kappa$ are called the mean direction and concentration parameter, respectively. The greater the value of $\kappa$, the higher the concentration of the distribution around the mean direction $\boldsymbol{\mu}$. The distribution is unimodal for $\kappa > 0$, and is uniform on the sphere for $\kappa = 0$.
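The density above can be evaluated numerically. The following is a minimal sketch using only the Python standard library; the function names (`bessel_i`, `vmf_norm_const`, `vmf_pdf`) are illustrative and not taken from any cited implementation. The Bessel function is summed from its power series, which is an assumption that is adequate only for moderate $\kappa$; a library routine would normally be preferred.

```python
import math

def bessel_i(v, x, terms=200):
    """Modified Bessel function of the first kind I_v(x), x > 0, by power series.

    Illustrative only: fine for moderate x, not a replacement for a library routine.
    """
    total = 0.0
    for k in range(terms):
        # k-th term (x/2)^(2k+v) / (k! * Gamma(k+v+1)), evaluated in log space
        log_term = ((2 * k + v) * math.log(x / 2.0)
                    - math.lgamma(k + 1) - math.lgamma(k + v + 1))
        total += math.exp(log_term)
    return total

def vmf_norm_const(p, kappa):
    """Normalization constant C_p(kappa); kappa = 0 gives the uniform density."""
    if kappa == 0.0:
        return math.gamma(p / 2.0) / (2.0 * math.pi ** (p / 2.0))
    v = p / 2.0 - 1.0
    return kappa ** v / ((2.0 * math.pi) ** (p / 2.0) * bessel_i(v, kappa))

def vmf_pdf(x, mu, kappa):
    """Density C_p(kappa) * exp(kappa * mu^T x) for unit vectors x and mu."""
    dot = sum(m * xi for m, xi in zip(mu, x))
    return vmf_norm_const(len(mu), kappa) * math.exp(kappa * dot)
```

For $p = 3$ the result of `vmf_norm_const` can be compared against the closed form $\kappa / (4\pi \sinh \kappa)$ as a sanity check.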

The von Mises–Fisher distribution for $p = 3$ is also called the Fisher distribution. It was first used to model the interaction of electric dipoles in an electric field. Other applications are found in geology, bioinformatics, and text mining.

Note on the normalization constant

In the textbook Directional Statistics by Mardia and Jupp, the normalization constant given for the von Mises–Fisher probability density is apparently different from the one given here, $C_p(\kappa)$. In that book, the normalization constant is specified as:

    $$C_p^{*}(\kappa) = \frac{(\kappa/2)^{p/2-1}}{\Gamma(p/2)\, I_{p/2-1}(\kappa)},$$

where $\Gamma$ is the gamma function. This is resolved by noting that Mardia and Jupp give the density "with respect to the uniform distribution", while the density here is specified in the usual way, with respect to Lebesgue measure. The density (w.r.t. Lebesgue measure) of the uniform distribution is the reciprocal of the surface area of the $(p-1)$-sphere, so that the uniform density function is given by the constant:

    $$C_p(0) = \frac{\Gamma(p/2)}{2\pi^{p/2}}.$$

It then follows that:

    $$C_p^{*}(\kappa) = \frac{C_p(\kappa)}{C_p(0)}.$$

While the value for $C_p(0)$ was derived above via the surface area, the same result may be obtained by setting $\kappa = 0$ in the above formula for $C_p(\kappa)$. This can be done by noting that the series expansion for $I_{p/2-1}(\kappa)$ divided by $\kappa^{p/2-1}$ has but one non-zero term at $\kappa = 0$. (To evaluate that term, one needs to use the definition $0^0 = 1$.)
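The limit described above can be written out explicitly. The following worked step is a sketch that uses the standard power series of the modified Bessel function, with $\nu = p/2 - 1$ introduced here as shorthand:

```latex
I_{\nu}(\kappa) = \sum_{k=0}^{\infty} \frac{(\kappa/2)^{2k+\nu}}{k!\,\Gamma(k+\nu+1)}
\quad\Longrightarrow\quad
\frac{I_{\nu}(\kappa)}{\kappa^{\nu}}
  = 2^{-\nu} \sum_{k=0}^{\infty} \frac{(\kappa/2)^{2k}}{k!\,\Gamma(k+\nu+1)}
  \;\xrightarrow{\;\kappa = 0\;}\; \frac{2^{-\nu}}{\Gamma(\nu+1)}
  = \frac{2^{1-p/2}}{\Gamma(p/2)},
\qquad
C_p(0) = \frac{\Gamma(p/2)\, 2^{p/2-1}}{(2\pi)^{p/2}} = \frac{\Gamma(p/2)}{2\pi^{p/2}} .
```

Only the $k = 0$ term survives at $\kappa = 0$, and it is this term that requires $0^0 = 1$.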

Support

The support of the Von Mises–Fisher distribution is the hypersphere, or more specifically, the $(p-1)$-sphere, denoted as

    $$S^{p-1} = \left\{ \mathbf{x} \in \mathbb{R}^{p} : \|\mathbf{x}\| = 1 \right\}.$$

This is a $(p-1)$-dimensional manifold embedded in $p$-dimensional Euclidean space, $\mathbb{R}^{p}$.

Relation to normal distribution

Starting from a normal distribution with isotropic covariance $\kappa^{-1}\mathbf{I}$ and mean $\boldsymbol{\mu}$ of length $r > 0$, whose density function is:

    $$G_p(\mathbf{x}; \boldsymbol{\mu}, \kappa) = \left(\sqrt{\frac{\kappa}{2\pi}}\right)^{p} \exp\left(-\kappa \frac{(\mathbf{x}-\boldsymbol{\mu})^\mathsf{T}(\mathbf{x}-\boldsymbol{\mu})}{2}\right),$$

the Von Mises–Fisher distribution is obtained by conditioning on $\|\mathbf{x}\| = 1$. By expanding

    $$(\mathbf{x}-\boldsymbol{\mu})^\mathsf{T}(\mathbf{x}-\boldsymbol{\mu}) = \mathbf{x}^\mathsf{T}\mathbf{x} + \boldsymbol{\mu}^\mathsf{T}\boldsymbol{\mu} - 2\boldsymbol{\mu}^\mathsf{T}\mathbf{x},$$

and using the fact that the first two right-hand-side terms are fixed, the Von Mises-Fisher density, $f_p(\mathbf{x}; r^{-1}\boldsymbol{\mu}, r\kappa)$, is recovered by recomputing the normalization constant by integrating $\mathbf{x}$ over the unit sphere. If $r = 0$, we get the uniform distribution, with density $f_p(\mathbf{x}; \mathbf{0}, 0)$.

More succinctly, the restriction of any isotropic multivariate normal density to the unit hypersphere gives a Von Mises-Fisher density, up to normalization.

This construction can be generalized by starting with a normal distribution with a general covariance matrix, in which case conditioning on $\|\mathbf{x}\| = 1$ gives the Fisher-Bingham distribution.

Estimation of parameters

Mean direction

A series of N independent unit vectors $\mathbf{x}_i$ are drawn from a von Mises–Fisher distribution. The maximum likelihood estimate of the mean direction $\boldsymbol{\mu}$ is simply the normalized arithmetic mean, a sufficient statistic:

    $$\hat{\boldsymbol{\mu}} = \bar{\mathbf{x}} / \bar{R}, \quad \text{where } \bar{\mathbf{x}} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{x}_i \text{ and } \bar{R} = \|\bar{\mathbf{x}}\|.$$
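The estimate of the mean direction amounts to averaging and renormalizing. A minimal sketch, assuming the samples are given as equal-length lists of floats (the function name `mean_direction` is illustrative):

```python
import math

def mean_direction(samples):
    """Return (mu_hat, r_bar): the normalized mean vector and the mean resultant length."""
    n = len(samples)
    p = len(samples[0])
    # Arithmetic mean of the unit vectors, component by component
    x_bar = [sum(x[j] for x in samples) / n for j in range(p)]
    r_bar = math.sqrt(sum(c * c for c in x_bar))
    if r_bar == 0.0:
        raise ValueError("mean resultant length is zero; mean direction is undefined")
    mu_hat = [c / r_bar for c in x_bar]
    return mu_hat, r_bar
```

The mean resultant length returned alongside the direction is the quantity needed for estimating the concentration parameter.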

Concentration parameter

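Use the modified Bessel function of the first kind to define

    $$A_p(\kappa) = \frac{I_{p/2}(\kappa)}{I_{p/2-1}(\kappa)}.$$

Then:

    $$\kappa = A_p^{-1}(\bar{R}).$$

Thus $\kappa$ is the solution to

    $$A_p(\kappa) = \frac{\left\| \sum_{i=1}^{N} \mathbf{x}_i \right\|}{N} = \bar{R}.$$

A simple approximation to $\kappa$ is (Sra, 2011)

    $$\hat{\kappa} = \frac{\bar{R}\,(p - \bar{R}^2)}{1 - \bar{R}^2}.$$

A more accurate inversion can be obtained by iterating the Newton method a few times

    $$\hat{\kappa}_1 = \hat{\kappa} - \frac{A_p(\hat{\kappa}) - \bar{R}}{1 - A_p(\hat{\kappa})^2 - \frac{p-1}{\hat{\kappa}} A_p(\hat{\kappa})},$$
    $$\hat{\kappa}_2 = \hat{\kappa}_1 - \frac{A_p(\hat{\kappa}_1) - \bar{R}}{1 - A_p(\hat{\kappa}_1)^2 - \frac{p-1}{\hat{\kappa}_1} A_p(\hat{\kappa}_1)}.$$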
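The approximation and the Newton refinement can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the Bessel ratio is computed from a truncated power series (adequate for moderate concentration only), and the names `bessel_i`, `a_p` and `estimate_kappa` are hypothetical.

```python
import math

def bessel_i(v, x, terms=200):
    """I_v(x) for x > 0 by power series; illustrative, suited to moderate x only."""
    total = 0.0
    for k in range(terms):
        log_term = ((2 * k + v) * math.log(x / 2.0)
                    - math.lgamma(k + 1) - math.lgamma(k + v + 1))
        total += math.exp(log_term)
    return total

def a_p(p, kappa):
    """Bessel ratio A_p(kappa) = I_{p/2}(kappa) / I_{p/2-1}(kappa)."""
    return bessel_i(p / 2.0, kappa) / bessel_i(p / 2.0 - 1.0, kappa)

def estimate_kappa(p, r_bar, newton_steps=2):
    """Closed-form starting value followed by Newton steps on A_p(kappa) = r_bar."""
    kappa = r_bar * (p - r_bar ** 2) / (1.0 - r_bar ** 2)
    for _ in range(newton_steps):
        a = a_p(p, kappa)
        # Derivative of A_p: 1 - A_p^2 - ((p - 1) / kappa) * A_p
        kappa -= (a - r_bar) / (1.0 - a * a - (p - 1.0) / kappa * a)
    return kappa
```

As a usage check, feeding in the exact value $\bar{R} = A_3(4) = \coth 4 - 1/4$ with `p = 3` should return a value close to 4.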

For N ≥ 25, the estimated spherical standard error of the sample mean direction can be computed as:

    $$\hat{\sigma} = \left( \frac{d}{N \bar{R}^2} \right)^{1/2},$$

where

    $$d = 1 - \frac{1}{N} \sum_{i=1}^{N} \left( \hat{\boldsymbol{\mu}}^\mathsf{T} \mathbf{x}_i \right)^2.$$

It is then possible to approximate a $100(1-\alpha)\%$ spherical confidence interval (a confidence cone) about $\hat{\boldsymbol{\mu}}$ with semi-vertical angle:

    $$q = \arcsin\left( e_\alpha^{1/2}\, \hat{\sigma} \right), \quad \text{where } e_\alpha = -\ln(\alpha).$$

For example, for a 95% confidence cone, $\alpha = 0.05$, $e_\alpha = -\ln(0.05) = 2.996$, and thus $q = \arcsin(1.731\, \hat{\sigma})$.
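The confidence-cone computation can be sketched directly from these formulas. The sketch assumes that the sample mean direction is used inside $d$ and that the data are concentrated enough for the argument of the arcsine to stay below one; the function name `confidence_cone` is illustrative.

```python
import math

def confidence_cone(samples, alpha=0.05):
    """Semi-vertical angle (radians) of the approximate 100(1-alpha)% confidence cone."""
    n = len(samples)
    p = len(samples[0])
    x_bar = [sum(x[j] for x in samples) / n for j in range(p)]
    r_bar = math.sqrt(sum(c * c for c in x_bar))
    mu_hat = [c / r_bar for c in x_bar]
    # d = 1 - mean squared cosine between the samples and the sample mean direction
    d = 1.0 - sum(sum(m * xi for m, xi in zip(mu_hat, x)) ** 2 for x in samples) / n
    sigma_hat = math.sqrt(d / (n * r_bar ** 2))
    e_alpha = -math.log(alpha)
    return math.asin(math.sqrt(e_alpha) * sigma_hat)
```

The approximation is stated for N ≥ 25; the function does not enforce that bound.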

Expected value

The expected value of the Von Mises–Fisher distribution is not on the unit hypersphere, but instead has a length of less than one. This length is given by $A_p(\kappa)$ as defined above. For a Von Mises–Fisher distribution with mean direction $\boldsymbol{\mu}$ and concentration $\kappa > 0$, the expected value is:

    $$A_p(\kappa)\, \boldsymbol{\mu}.$$

For $\kappa = 0$, the expected value is at the origin. For finite $\kappa > 0$, the length of the expected value is strictly between zero and one and is a monotonically increasing function of $\kappa$.

The empirical mean (arithmetic average) of a collection of points on the unit hypersphere behaves in a similar manner, being close to the origin for widely spread data and close to the sphere for concentrated data. Indeed, for the Von Mises–Fisher distribution fitted by maximum likelihood to a collection of points, the expected value is equal to the empirical mean of those points.

Entropy and KL divergence

The expected value can be used to compute differential entropy and KL divergence.

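The differential entropy of $\text{VMF}(\boldsymbol{\mu}, \kappa)$ is:

    $$-\left\langle \log f_p(\mathbf{x}; \boldsymbol{\mu}, \kappa) \right\rangle_{\mathbf{x} \sim \text{VMF}(\boldsymbol{\mu}, \kappa)} = -\log f_p(A_p(\kappa)\boldsymbol{\mu}; \boldsymbol{\mu}, \kappa) = -\log C_p(\kappa) - \kappa A_p(\kappa),$$

where the angle brackets denote expectation. Notice that the entropy is a function of $\kappa$ only.

The KL divergence between $\text{VMF}(\boldsymbol{\mu}_0, \kappa_0)$ and $\text{VMF}(\boldsymbol{\mu}_1, \kappa_1)$ is:

    $$\left\langle \log \frac{f_p(\mathbf{x}; \boldsymbol{\mu}_0, \kappa_0)}{f_p(\mathbf{x}; \boldsymbol{\mu}_1, \kappa_1)} \right\rangle_{\mathbf{x} \sim \text{VMF}(\boldsymbol{\mu}_0, \kappa_0)} = \log \frac{f_p(A_p(\kappa_0)\boldsymbol{\mu}_0; \boldsymbol{\mu}_0, \kappa_0)}{f_p(A_p(\kappa_0)\boldsymbol{\mu}_0; \boldsymbol{\mu}_1, \kappa_1)}.$$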
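Because the log-density is linear in $\mathbf{x}$, the KL divergence can be expanded into an explicit expression. The following worked step is a sketch that substitutes the density $f_p(\mathbf{x}; \boldsymbol{\mu}, \kappa) = C_p(\kappa)\exp(\kappa \boldsymbol{\mu}^\mathsf{T}\mathbf{x})$ and uses $\boldsymbol{\mu}_0^\mathsf{T}\boldsymbol{\mu}_0 = 1$:

```latex
\log \frac{f_p(A_p(\kappa_0)\boldsymbol{\mu}_0; \boldsymbol{\mu}_0, \kappa_0)}
          {f_p(A_p(\kappa_0)\boldsymbol{\mu}_0; \boldsymbol{\mu}_1, \kappa_1)}
= \log \frac{C_p(\kappa_0)}{C_p(\kappa_1)}
  + A_p(\kappa_0) \left( \kappa_0 - \kappa_1\, \boldsymbol{\mu}_1^\mathsf{T} \boldsymbol{\mu}_0 \right).
```

Setting $\boldsymbol{\mu}_1 = \boldsymbol{\mu}_0$ and $\kappa_1 = \kappa_0$ gives zero, as expected.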

Transformation

Von Mises-Fisher (VMF) distributions are closed under orthogonal linear transforms. Let $\mathbf{U}$ be a $p$-by-$p$ orthogonal matrix. Let $\mathbf{x} \sim \text{VMF}(\boldsymbol{\mu}, \kappa)$ and apply the invertible linear transform: $\mathbf{y} = \mathbf{U}\mathbf{x}$. The inverse transform is $\mathbf{x} = \mathbf{U}^\mathsf{T}\mathbf{y}$, because the inverse of an orthogonal matrix is its transpose: $\mathbf{U}^{-1} = \mathbf{U}^\mathsf{T}$. The Jacobian of the transform is $\mathbf{U}$, for which the absolute value of its determinant is 1, also because of the orthogonality. Using these facts and the form of the VMF density, it follows that:

    $$\mathbf{y} \sim \text{VMF}(\mathbf{U}\boldsymbol{\mu}, \kappa).$$

One may verify that since $\boldsymbol{\mu}$ and $\mathbf{x}$ are unit vectors, then by the orthogonality, so are $\mathbf{U}\boldsymbol{\mu}$ and $\mathbf{y}$.

Pseudo-random number generation

General case

An algorithm for drawing pseudo-random samples from the Von Mises Fisher (VMF) distribution was given by Ulrich and later corrected by Wood. An implementation in R is given by Hornik and Grün; and a fast Python implementation is described by Pinzón and Jung.

To simulate from a VMF distribution on the $(p-1)$-dimensional unit sphere, $S^{p-1}$, with mean direction $\boldsymbol{\mu} \in S^{p-1}$, these algorithms use the following radial-tangential decomposition for a point $\mathbf{x} \in S^{p-1} \subset \mathbb{R}^{p}$:

    $$\mathbf{x} = t\boldsymbol{\mu} + \sqrt{1 - t^2}\, \mathbf{v},$$

where $\mathbf{v} \in \mathbb{R}^{p}$ lives in the tangential $(p-2)$-dimensional unit-subsphere that is centered at and perpendicular to $\boldsymbol{\mu}$, while $t = \boldsymbol{\mu}^\mathsf{T}\mathbf{x} \in [-1, 1]$. To draw a sample $\mathbf{x}$ from a VMF with parameters $\boldsymbol{\mu}$ and $\kappa$, $\mathbf{v}$ must be drawn from the uniform distribution on the tangential subsphere, and the radial component, $t$, must be drawn independently from the distribution with density:

    $$f_\text{radial}(t; \kappa, p) = \frac{(\kappa/2)^{\nu}}{\Gamma(\tfrac{1}{2})\, \Gamma(\nu + \tfrac{1}{2})\, I_{\nu}(\kappa)}\, e^{t\kappa} (1 - t^2)^{\nu - \frac{1}{2}},$$

where $\nu = \frac{p}{2} - 1$. The normalization constant for this density may be verified by using:

    $$I_{\nu}(\kappa) = \frac{(\kappa/2)^{\nu}}{\Gamma(\tfrac{1}{2})\, \Gamma(\nu + \tfrac{1}{2})} \int_{-1}^{1} e^{t\kappa} (1 - t^2)^{\nu - \frac{1}{2}}\, dt,$$

as given in Appendix 1 (A.3) in Directional Statistics. Drawing the $t$ samples from this density by using a rejection sampling algorithm is explained in the above references. To draw the uniform $\mathbf{v}$ samples perpendicular to $\boldsymbol{\mu}$, see the algorithms in the above references; alternatively, a Householder transform can be used.
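The radial-tangential recipe can be sketched in pure Python. The radial step follows the rejection scheme commonly attributed to Wood (beta proposal with envelope parameters `b`, `x0`, `c`), written here from the usual textbook statement rather than from the cited papers. The tangential step is an assumption of this sketch: instead of a Householder transform, it removes the $\boldsymbol{\mu}$-component from a standard Gaussian vector and normalizes, which by rotational symmetry is uniform on the tangential subsphere. The function names are hypothetical, and $p \ge 2$ is required.

```python
import math
import random

def sample_radial(p, kappa, rng):
    """Draw t = mu^T x by rejection sampling with a transformed beta proposal."""
    m = p - 1.0
    b = (-2.0 * kappa + math.sqrt(4.0 * kappa ** 2 + m ** 2)) / m
    x0 = (1.0 - b) / (1.0 + b)
    c = kappa * x0 + m * math.log(1.0 - x0 ** 2)
    while True:
        z = rng.betavariate(m / 2.0, m / 2.0)
        t = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
        u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is finite
        if kappa * t + m * math.log(1.0 - x0 * t) - c >= math.log(u):
            return t

def sample_vmf(mu, kappa, rng=None):
    """Draw one unit vector from VMF(mu, kappa); mu must be a unit vector."""
    if rng is None:
        rng = random.Random()
    p = len(mu)
    t = sample_radial(p, kappa, rng)
    # Tangential part: Gaussian vector with its mu-component projected out
    while True:
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        proj = sum(zi * mi for zi, mi in zip(z, mu))
        v = [zi - proj * mi for zi, mi in zip(z, mu)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 0.0:
            break
    s = math.sqrt(max(0.0, 1.0 - t * t))
    return [t * mi + s * vi / norm for mi, vi in zip(mu, v)]
```

A simple check of such a sampler is that the average of $\boldsymbol{\mu}^\mathsf{T}\mathbf{x}$ over many draws approaches $A_p(\kappa)$, which for $p = 3$ equals $\coth\kappa - 1/\kappa$.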

3-D sphere

To generate a Von Mises–Fisher distributed pseudo-random spherical 3-D unit vector $\mathbf{X}_s$ on the $S^2$ sphere for a given $\boldsymbol{\mu}$ and $\kappa$, define

    $$\mathbf{X}_s = [\theta, \phi, r],$$

where $\theta$ is the polar angle, $\phi$ the azimuthal angle, and $r = 1$ the distance to the center of the sphere.

For $\boldsymbol{\mu} = [0, (\cdot), 1]$ (polar angle zero, any azimuth), the pseudo-random triplet is then given by

    $$\mathbf{X}_s = [\arccos W, V, 1],$$

where $V$ is sampled from the continuous uniform distribution $U(a, b)$ with lower bound $a$ and upper bound $b$,

    $$V \sim U(0, 2\pi),$$

and

    $$W = 1 + \frac{1}{\kappa} \left( \ln \xi + \ln\left( 1 - \frac{\xi - 1}{\xi} e^{-2\kappa} \right) \right),$$

where $\xi$ is sampled from the standard continuous uniform distribution $U(0, 1)$,

    $$\xi \sim U(0, 1).$$

Here, $W$ should be set to its limiting value $W = -1$ when $\xi = 0$, and $\mathbf{X}_s$ should be rotated to match any other desired $\boldsymbol{\mu}$.
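The 3-D recipe is an inverse-transform sampler for the cosine of the polar angle. A minimal sketch, assuming $\kappa > 0$ and a mean direction at the north pole; the function name is illustrative. The two logarithms in the expression for $W$ are merged into $\ln\left(\xi + (1 - \xi)e^{-2\kappa}\right)$, which is algebraically the same quantity and tends to $-2\kappa$ (so $W \to -1$) as $\xi \to 0$.

```python
import math
import random

def sample_vmf_3d_north(kappa, rng=None):
    """Return (theta, phi, r) for one VMF draw with mean direction at theta = 0."""
    if rng is None:
        rng = random.Random()
    xi = rng.random()                      # uniform on [0, 1)
    v = rng.uniform(0.0, 2.0 * math.pi)    # azimuthal angle
    # Inverse CDF of the density proportional to exp(kappa * w) on [-1, 1].
    # The merged logarithm stays finite at xi = 0 unless exp(-2 * kappa) underflows.
    w = 1.0 + math.log(xi + (1.0 - xi) * math.exp(-2.0 * kappa)) / kappa
    w = max(-1.0, min(1.0, w))             # guard against rounding outside [-1, 1]
    return math.acos(w), v, 1.0
```

Rotating the result to another mean direction is left out of this sketch.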

Distribution of polar angle

For $p = 3$, the angle θ between $\mathbf{x}$ and $\boldsymbol{\mu}$ satisfies $\cos\theta = \boldsymbol{\mu}^\mathsf{T}\mathbf{x}$. It has the distribution

    $$p(\theta) = \int d^2x\, f(\mathbf{x}; \boldsymbol{\mu}, \kappa)\, \delta\left( \theta - \arccos(\boldsymbol{\mu}^\mathsf{T}\mathbf{x}) \right),$$

which can be easily evaluated as

    $$p(\theta) = 2\pi C_3(\kappa)\, \sin\theta\, e^{\kappa \cos\theta}.$$
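As a consistency check, the polar-angle density integrates to one. The following worked step is a sketch using the substitution $t = \cos\theta$ and the closed form $C_3(\kappa) = \kappa / \left(2\pi(e^{\kappa} - e^{-\kappa})\right)$:

```latex
\int_{0}^{\pi} 2\pi C_3(\kappa) \sin\theta\, e^{\kappa\cos\theta}\, d\theta
= 2\pi C_3(\kappa) \int_{-1}^{1} e^{\kappa t}\, dt
= 2\pi C_3(\kappa)\, \frac{e^{\kappa} - e^{-\kappa}}{\kappa}
= 1 .
```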

For the general case, $p \ge 2$, the distribution for the cosine of this angle:

    $$\cos\theta = t = \boldsymbol{\mu}^\mathsf{T}\mathbf{x}$$

is given by $f_\text{radial}(t; \kappa, p)$, as explained above.

The uniform hypersphere distribution

When $\kappa = 0$, the Von Mises–Fisher distribution, $\text{VMF}(\boldsymbol{\mu}, \kappa)$ on $S^{p-1}$, simplifies to the uniform distribution on $S^{p-1} \subset \mathbb{R}^{p}$. The density is constant with value $C_p(0)$. Pseudo-random samples can be generated by generating samples in $\mathbb{R}^{p}$ from the standard multivariate normal distribution, followed by normalization to unit norm.
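The normalize-a-Gaussian recipe can be sketched in a few lines with the standard library; the function name `sample_uniform_sphere` is illustrative.

```python
import math
import random

def sample_uniform_sphere(p, rng=None):
    """Draw one point uniformly from the unit sphere in R^p."""
    if rng is None:
        rng = random.Random()
    while True:
        # Standard multivariate normal sample: p independent N(0, 1) components
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        norm = math.sqrt(sum(c * c for c in z))
        if norm > 0.0:  # retry in the (practically impossible) all-zero case
            return [c / norm for c in z]
```

The method works because the standard multivariate normal density depends on its argument only through the norm, so the direction of the sample is uniformly distributed.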

Component marginal of uniform distribution

For $1 \le i \le p$, let $x_i$ be any component of $\mathbf{x} \in S^{p-1}$. The marginal distribution for $x_i$ has the density:

    $$f_i(x_i; p) = f_\text{radial}(x_i; \kappa = 0, p) = \frac{(1 - x_i^2)^{\frac{p-1}{2} - 1}}{B\left(\tfrac{1}{2}, \tfrac{p-1}{2}\right)},$$

where $B(\alpha, \beta)$ is the beta function. This distribution may be better understood by highlighting its relation to the beta distribution:

    $$x_i^2 \sim \text{Beta}\left(\tfrac{1}{2}, \tfrac{p-1}{2}\right), \qquad \frac{x_i + 1}{2} \sim \text{Beta}\left(\tfrac{p-1}{2}, \tfrac{p-1}{2}\right),$$

where the Legendre duplication formula is useful to understand the relationships between the normalization constants of the various densities above.

Note that the components of $\mathbf{x} \in S^{p-1}$ are not independent, so that the uniform density is not the product of the marginal densities; and $\mathbf{x}$ cannot be assembled by independent sampling of the components.

Distribution of dot-products

In machine learning, especially in image classification, to-be-classified inputs (e.g. images) are often compared using cosine similarity, which is the dot product between intermediate representations in the form of unit vectors (termed embeddings). The dimensionality is typically high, with $p$ at least several hundred. The deep neural networks that extract embeddings for classification should learn to spread the classes as far apart as possible, and ideally this should give classes that are uniformly distributed on $S^{p-1}$. For a better statistical understanding of across-class cosine similarity, the distribution of dot-products between unit vectors independently sampled from the uniform distribution may be helpful.


Let $\mathbf{x}, \mathbf{y} \in S^{p-1}$ be unit vectors in $\mathbb{R}^{p}$, independently sampled from the uniform distribution. Define:

    $$t = \mathbf{x}^\mathsf{T}\mathbf{y} \in [-1, 1], \qquad r = \frac{t + 1}{2} \in [0, 1], \qquad s = \text{logit}(r) = \log\frac{1 + t}{1 - t} \in \mathbb{R},$$

where $t$ is the dot-product and $r, s$ are transformed versions of it. Then the distribution for $t$ is the same as the marginal component distribution given above; the distribution for $r$ is symmetric beta and the distribution for $s$ is symmetric logistic-beta:

    $$r \sim \text{Beta}\left(\tfrac{p-1}{2}, \tfrac{p-1}{2}\right), \qquad s \sim B_\sigma\left(\tfrac{p-1}{2}, \tfrac{p-1}{2}\right).$$

The means and variances are:

    $$E[t] = 0, \qquad E[r] = \tfrac{1}{2}, \qquad E[s] = 0,$$

and

    $$\text{var}[t] = \frac{1}{p}, \qquad \text{var}[r] = \frac{1}{4p}, \qquad \text{var}[s] = 2\psi'\left(\tfrac{p-1}{2}\right) \approx \frac{4}{p-1},$$

where $\psi' = \psi^{(1)}$ is the first polygamma function. The variances decrease, the distributions of all three variables become more Gaussian, and the final approximation gets better as the dimensionality, $p$, is increased.
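The variance formulas can be checked by simulation. The following is a minimal Monte Carlo sketch with hypothetical function names; uniform unit vectors are produced by normalizing standard Gaussian samples.

```python
import math
import random

def uniform_unit_vector(p, rng):
    """Uniform point on the unit sphere in R^p via a normalized Gaussian sample."""
    while True:
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        norm = math.sqrt(sum(c * c for c in z))
        if norm > 0.0:
            return [c / norm for c in z]

def variance(values):
    """Population variance of a list of floats."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def dot_product_variances(p, n_pairs, rng):
    """Empirical variances of t, r = (t + 1) / 2 and s = log((1 + t) / (1 - t))."""
    ts = []
    for _ in range(n_pairs):
        x = uniform_unit_vector(p, rng)
        y = uniform_unit_vector(p, rng)
        ts.append(sum(a * b for a, b in zip(x, y)))
    rs = [(t + 1.0) / 2.0 for t in ts]
    ss = [math.log((1.0 + t) / (1.0 - t)) for t in ts]
    return variance(ts), variance(rs), variance(ss)
```

With a moderately large number of pairs, the three returned values should be close to $1/p$, $1/(4p)$ and roughly $4/(p-1)$ respectively, with the last agreement improving as $p$ grows.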

Generalizations

Matrix Von Mises-Fisher

The matrix von Mises-Fisher distribution (also known as matrix Langevin distribution) has the density

    $$f_{n,p}(\mathbf{X}; \mathbf{F}) \propto \exp\left( \operatorname{tr}(\mathbf{F}^\mathsf{T}\mathbf{X}) \right),$$

supported on the Stiefel manifold of $n \times p$ orthonormal p-frames $\mathbf{X}$, where $\mathbf{F}$ is an arbitrary $n \times p$ real matrix.

Saw distributions

Ulrich, in designing an algorithm for sampling from the VMF distribution, makes use of a family of distributions named after and explored by John G. Saw. A Saw distribution is a distribution on the $(p-1)$-sphere, $S^{p-1}$, with modal vector $\boldsymbol{\mu} \in S^{p-1}$ and concentration $\kappa \ge 0$, and of which the density function has the form:

    $$f_\text{Saw}(\mathbf{x}; \boldsymbol{\mu}, \kappa) = \frac{g(\kappa\, \mathbf{x}^\mathsf{T}\boldsymbol{\mu})}{K_p(\kappa)},$$

where $g$ is a non-negative, increasing function, and where $K_p(\kappa)$ is the normalization constant. The above-mentioned radial-tangential decomposition generalizes to the Saw family, and the radial component, $t = \mathbf{x}^\mathsf{T}\boldsymbol{\mu}$, has the density:

    $$f_\text{Saw-radial}(t; \kappa) = \frac{2\pi^{p/2}}{\Gamma(p/2)} \cdot \frac{g(\kappa t)\, (1 - t^2)^{(p-3)/2}}{B\left(\tfrac{1}{2}, \tfrac{p-1}{2}\right) K_p(\kappa)},$$

where $B$ is the beta function. Also notice that the left-hand factor of the radial density is the surface area of $S^{p-1}$.

By setting $g(\kappa\, \mathbf{x}^\mathsf{T}\boldsymbol{\mu}) = e^{\kappa \mathbf{x}^\mathsf{T}\boldsymbol{\mu}}$, one recovers the VMF distribution.

Further reading

  • Dhillon, I., Sra, S. (2003) "Modeling Data using Directional Distributions". Tech. rep., University of Texas, Austin.
  • Banerjee, A., Dhillon, I. S., Ghosh, J., & Sra, S. (2005). "Clustering on the unit hypersphere using von Mises-Fisher distributions". Journal of Machine Learning Research, 6(Sep), 1345-1382.
  • Sra, S. (2011). "A short note on parameter approximation for von Mises-Fisher distributions: And a fast implementation of I_s(x)". Computational Statistics. 27: 177–190. CiteSeerX 10.1.1.186.1887. doi:10.1007/s00180-011-0232-x. S2CID 3654195.
