Kolmogorov Structure Function

In 1973, Andrey Kolmogorov proposed a non-probabilistic approach to statistics and model selection.

Let each datum be a finite binary string and a model be a finite set of binary strings. Consider model classes consisting of models of given maximal Kolmogorov complexity. The Kolmogorov structure function of an individual data string expresses the relation between the complexity level constraint on a model class and the least log-cardinality of a model in the class containing the data. The structure function determines all stochastic properties of the individual data string: for every constrained model class it determines the individual best-fitting model in the class irrespective of whether the true model is in the model class considered or not. In the classical case we talk about a set of data with a probability distribution, and the properties are those of the expectations. In contrast, here we deal with individual data strings and the properties of the individual string focused on. In this setting, a property holds with certainty rather than with high probability as in the classical case. The Kolmogorov structure function precisely quantifies the goodness-of-fit of an individual model with respect to individual data.

The Kolmogorov structure function is used in the algorithmic information theory, also known as the theory of Kolmogorov complexity, for describing the structure of a string by use of models of increasing complexity.

Kolmogorov's definition

Kolmogorov Structure Function 
Kolmogorov (left) talks on the structure function (see drawing on the blackboard) in (Tallinn, 1973).

The structure function was originally proposed by Kolmogorov in 1973 at a Soviet Information Theory symposium in Tallinn, but these results were not published p. 182. But the results were announced in in 1974, the only written record by Kolmogorov himself. One of his last scientific statements is (translated from the original Russian by L.A. Levin):

To each constructive object corresponds a function Kolmogorov Structure Function  of a natural number k—the log of minimal cardinality of x-containing sets that allow definitions of complexity at most k. If the element x itself allows a simple definition, then the function Kolmogorov Structure Function  drops to 0 even for small k. Lacking such definition, the element is "random" in a negative sense. But it is positively "probabilistically random" only when function Kolmogorov Structure Function  having taken the value Kolmogorov Structure Function  at a relatively small Kolmogorov Structure Function , then changes approximately as Kolmogorov Structure Function .

— Kolmogorov, announcement cited above

Contemporary definition

It is discussed in Cover and Thomas. It is extensively studied in Vereshchagin and Vitányi where also the main properties are resolved. The Kolmogorov structure function can be written as

    Kolmogorov Structure Function 

where Kolmogorov Structure Function  is a binary string of length Kolmogorov Structure Function  with Kolmogorov Structure Function  where Kolmogorov Structure Function  is a contemplated model (set of n-length strings) for Kolmogorov Structure Function , Kolmogorov Structure Function  is the Kolmogorov complexity of Kolmogorov Structure Function  and Kolmogorov Structure Function  is a nonnegative integer value bounding the complexity of the contemplated Kolmogorov Structure Function 's. Clearly, this function is nonincreasing and reaches Kolmogorov Structure Function  for Kolmogorov Structure Function  where Kolmogorov Structure Function  is the required number of bits to change Kolmogorov Structure Function  into Kolmogorov Structure Function  and Kolmogorov Structure Function  is the Kolmogorov complexity of Kolmogorov Structure Function .

The algorithmic sufficient statistic

We define a set Kolmogorov Structure Function  containing Kolmogorov Structure Function  such that

    Kolmogorov Structure Function .

The function Kolmogorov Structure Function  never decreases more than a fixed independent constant below the diagonal called sufficiency line L defined by

    Kolmogorov Structure Function .

It is approached to within a constant distance by the graph of Kolmogorov Structure Function  for certain arguments (for instance, for Kolmogorov Structure Function ). For these Kolmogorov Structure Function 's we have Kolmogorov Structure Function  and the associated model Kolmogorov Structure Function  (witness for Kolmogorov Structure Function ) is called an optimal set for Kolmogorov Structure Function , and its description of Kolmogorov Structure Function  bits is therefore an algorithmic sufficient statistic. We write `algorithmic' for `Kolmogorov complexity' by convention. The main properties of an algorithmic sufficient statistic are the following: If Kolmogorov Structure Function  is an algorithmic sufficient statistic for Kolmogorov Structure Function , then

    Kolmogorov Structure Function .

That is, the two-part description of Kolmogorov Structure Function  using the model Kolmogorov Structure Function  and as data-to-model code the index of Kolmogorov Structure Function  in the enumeration of Kolmogorov Structure Function  in Kolmogorov Structure Function  bits, is as concise as the shortest one-part code of Kolmogorov Structure Function  in Kolmogorov Structure Function  bits. This can be easily seen as follows:

    Kolmogorov Structure Function ,
Kolmogorov Structure Function 
Structure functions Kolmogorov Structure Function  and minimal sufficient statistic.

using straightforward inequalities and the sufficiency property, we find that Kolmogorov Structure Function . (For example, given Kolmogorov Structure Function , we can describe Kolmogorov Structure Function  self-delimitingly (you can determine its end) in Kolmogorov Structure Function  bits.) Therefore, the randomness deficiency Kolmogorov Structure Function  of Kolmogorov Structure Function  in Kolmogorov Structure Function  is a constant, which means that Kolmogorov Structure Function  is a typical (random) element of S. However, there can be models Kolmogorov Structure Function  containing Kolmogorov Structure Function  that are not sufficient statistics. An algorithmic sufficient statistic Kolmogorov Structure Function  for Kolmogorov Structure Function  has the additional property, apart from being a model of best fit, that Kolmogorov Structure Function  and therefore by the Kolmogorov complexity symmetry of information (the information about Kolmogorov Structure Function  in Kolmogorov Structure Function  is about the same as the information about Kolmogorov Structure Function  in x) we have Kolmogorov Structure Function : the algorithmic sufficient statistic Kolmogorov Structure Function  is a model of best fit that is almost completely determined by Kolmogorov Structure Function . (Kolmogorov Structure Function  is a shortest program for Kolmogorov Structure Function .) The algorithmic sufficient statistic associated with the least such Kolmogorov Structure Function  is called the algorithmic minimal sufficient statistic.

With respect to the picture: The MDL structure function Kolmogorov Structure Function  is explained below. The Goodness-of-fit structure function Kolmogorov Structure Function  is the least randomness deficiency (see above) of any model Kolmogorov Structure Function  for Kolmogorov Structure Function  such that Kolmogorov Structure Function . This structure function gives the goodness-of-fit of a model Kolmogorov Structure Function  (containing x) for the string x. When it is low the model fits well, and when it is high the model doesn't fit well. If Kolmogorov Structure Function  for some Kolmogorov Structure Function  then there is a typical model Kolmogorov Structure Function  for Kolmogorov Structure Function  such that Kolmogorov Structure Function  and Kolmogorov Structure Function  is typical (random) for S. That is, Kolmogorov Structure Function  is the best-fitting model for x. For more details see and especially and.

Selection of properties

Within the constraints that the graph goes down at an angle of at least 45 degrees, that it starts at n and ends approximately at Kolmogorov Structure Function , every graph (up to a Kolmogorov Structure Function  additive term in argument and value) is realized by the structure function of some data x and vice versa. Where the graph hits the diagonal first the argument (complexity) is that of the minimum sufficient statistic. It is incomputable to determine this place. See.

Main property

It is proved that at each level Kolmogorov Structure Function  of complexity the structure function allows us to select the best model Kolmogorov Structure Function  for the individual string x within a strip of Kolmogorov Structure Function  with certainty, not with great probability.

The MDL variant

The Minimum description length (MDL) function: The length of the minimal two-part code for x consisting of the model cost K(S) and the length of the index of x in S, in the model class of sets of given maximal Kolmogorov complexity Kolmogorov Structure Function , the complexity of S upper bounded by Kolmogorov Structure Function , is given by the MDL function or constrained MDL estimator:

    Kolmogorov Structure Function 

where Kolmogorov Structure Function  is the total length of two-part code of x with help of model S.

Main property

It is proved that at each level Kolmogorov Structure Function  of complexity the structure function allows us to select the best model S for the individual string x within a strip of Kolmogorov Structure Function  with certainty, not with great probability.

Application in statistics

The mathematics developed above were taken as the foundation of MDL by its inventor Jorma Rissanen.

Probability models

For every computable probability distribution Kolmogorov Structure Function  it can be proved that

    Kolmogorov Structure Function .

For example, if Kolmogorov Structure Function  is some computable distribution on the set Kolmogorov Structure Function  of strings of length Kolmogorov Structure Function , then each Kolmogorov Structure Function  has probability Kolmogorov Structure Function . Kolmogorov's structure function becomes

    Kolmogorov Structure Function 

where x is a binary string of length n with Kolmogorov Structure Function  where Kolmogorov Structure Function  is a contemplated model (computable probability of Kolmogorov Structure Function -length strings) for Kolmogorov Structure Function , Kolmogorov Structure Function  is the Kolmogorov complexity of Kolmogorov Structure Function  and Kolmogorov Structure Function  is an integer value bounding the complexity of the contemplated Kolmogorov Structure Function 's. Clearly, this function is non-increasing and reaches Kolmogorov Structure Function  for Kolmogorov Structure Function  where c is the required number of bits to change Kolmogorov Structure Function  into Kolmogorov Structure Function  and Kolmogorov Structure Function  is the Kolmogorov complexity of Kolmogorov Structure Function . Then Kolmogorov Structure Function . For every complexity level Kolmogorov Structure Function  the function Kolmogorov Structure Function  is the Kolmogorov complexity version of the maximum likelihood (ML).

Main property

It is proved that at each level Kolmogorov Structure Function  of complexity the structure function allows us to select the best model Kolmogorov Structure Function  for the individual string Kolmogorov Structure Function  within a strip of Kolmogorov Structure Function  with certainty, not with great probability.

The MDL variant and probability models

The MDL function: The length of the minimal two-part code for x consisting of the model cost K(P) and the length of Kolmogorov Structure Function , in the model class of computable probability mass functions of given maximal Kolmogorov complexity Kolmogorov Structure Function , the complexity of P upper bounded by Kolmogorov Structure Function , is given by the MDL function or constrained MDL estimator:

    Kolmogorov Structure Function 

where Kolmogorov Structure Function  is the total length of two-part code of x with help of model P.

Main property

It is proved that at each level Kolmogorov Structure Function  of complexity the MDL function allows us to select the best model P for the individual string x within a strip of Kolmogorov Structure Function  with certainty, not with great probability.

Extension to rate distortion and denoising

It turns out that the approach can be extended to a theory of rate distortion of individual finite sequences and denoising of individual finite sequences using Kolmogorov complexity. Experiments using real compressor programs have been carried out with success. Here the assumption is that for natural data the Kolmogorov complexity is not far from the length of a compressed version using a good compressor.

References

Literature

Tags:

Kolmogorov Structure Function Kolmogorovs definitionKolmogorov Structure Function Contemporary definitionKolmogorov Structure Function The MDL variantKolmogorov Structure Function Probability modelsKolmogorov Structure Function The MDL variant and probability modelsKolmogorov Structure Function Extension to rate distortion and denoisingKolmogorov Structure Function LiteratureKolmogorov Structure FunctionAndrey KolmogorovKolmogorov complexityStochastic

🔥 Trending searches on Wiki English:

The WeekndMrs Chatterjee Vs NorwayMatthew McConaugheyRRR (film)United StatesXXX (2002 film)Joe BidenFC Bayern MunichThe Night AgentEurovision Song Contest 2023Chris HemsworthCanelo ÁlvarezNicholas BraunCocaine BearAubrey PlazaBenjamin NetanyahuRam CharanRory CulkinTabu (actress)Shazam! (film)Mike TysonIOSAnimalReese WitherspoonDonald SutherlandGrace Caroline CurreyMighty Morphin Power RangersKiefer SutherlandThuy TrangRothschild familyLorem ipsumRodney TerryMicrosoft Exchange ServerTimothy McVeighMexicoJenna OrtegaBen Foster (footballer)SexNorth America14th Dalai LamaChatGPTDavid BowieAnd Then There Were NoneAshley Johnson (actress)Stephen HawkingAKim KardashianRahul GandhiWorld War IRyan PhillippeNeatsville, KentuckyCzech RepublicFIFA World CupRupert MurdochNikki Catsouras photographs controversyGermanyJonah HillNetherlandsScream (2022 film)Will SmithTardigradeDonald TrumpIsaac HerzogAnthony VolpeDavid BenavidezBecky GJim CaviezelDisappearance of Madeleine McCannJack BlackItamar Ben-GvirRuud van NistelrooyRey MysterioShrek (franchise)Annie Lennox🡆 More