Least Squares Inference In Phylogeny

Least squares inference in phylogeny generates a phylogenetic tree based on an observed matrix of pairwise genetic distances and optionally a weight matrix.

The goal is to find the tree whose path-length distances match the observed distances as closely as possible.

Ordinary and weighted least squares

The discrepancy between the observed pairwise distances $D_{ij}$ and the distances $T_{ij}$ over a phylogenetic tree (i.e. the sum of the branch lengths on the path from leaf $i$ to leaf $j$) is measured by

    $$S = \sum_{ij} w_{ij} (D_{ij} - T_{ij})^2$$

where the weights $w_{ij}$ depend on the least squares method used. Least squares distance tree construction aims to find the tree (topology and branch lengths) with minimal $S$. This is a non-trivial problem: it involves searching the discrete space of unrooted binary tree topologies, whose size grows faster than exponentially in the number of leaves. For $n \geq 3$ leaves there are 1 • 3 • 5 • ... • (2n-5) different topologies, so enumerating them is infeasible even for a modest number of leaves. Heuristic search methods are therefore used to find a reasonably good topology. The evaluation of $S$ for a given topology (which includes the computation of the branch lengths) is a linear least squares problem.

There are several ways to weight the squared errors $(D_{ij} - T_{ij})^2$, depending on the knowledge and assumptions about the variances of the observed distances. When nothing is known about the errors, or if they are assumed to be independently distributed with equal variance for all observed distances, all the weights $w_{ij}$ are set to one. This leads to an ordinary least squares estimate.

In the weighted least squares case the errors are assumed to be independent (or their correlations are not known), but their variances may differ. Given independent errors, each weight should ideally be set to the inverse of the variance of the corresponding distance estimate. Sometimes the variances are not known, but they can be modeled as a function of the distance estimates. In the Fitch and Margoliash method, for instance, the variances are assumed to be proportional to the squared distances, so that $w_{ij} = 1/D_{ij}^2$.
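For a fixed topology, the branch-length fit described above can be sketched as follows. This is a minimal illustration, assuming NumPy and a four-leaf tree with topology ((0,1),(2,3)); the names `PAIRS`, `PATHS`, `path_matrix` and `fit_branch_lengths` are hypothetical and not taken from any phylogenetics package, and the observed distances are made up. Each observed distance contributes one linear equation in the branch lengths, and the weights enter by scaling each equation with the square root of its weight.

```python
import numpy as np

# The six leaf pairs (i, j) of a four-leaf tree, in a fixed order.
PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

# Assumed encoding of the topology ((0,1),(2,3)): edges 0..3 are the pendant
# branches of leaves 0..3 and edge 4 is the internal branch.
PATHS = {
    (0, 1): [0, 1],
    (0, 2): [0, 2, 4],
    (0, 3): [0, 3, 4],
    (1, 2): [1, 2, 4],
    (1, 3): [1, 3, 4],
    (2, 3): [2, 3],
}


def path_matrix(pairs, paths, n_edges):
    """Return X with X[k, e] = 1 if edge e lies on the path of pair k."""
    X = np.zeros((len(pairs), n_edges))
    for row, pair in enumerate(pairs):
        X[row, paths[pair]] = 1.0
    return X


def fit_branch_lengths(X, d, w):
    """Minimise S = sum_k w[k] * (d[k] - (X @ b)[k])**2 over b.

    Returns the fitted branch lengths b and the minimal value of S.
    """
    sqrt_w = np.sqrt(w)
    # Scaling rows by sqrt(w) turns the weighted problem into an ordinary one.
    b, *_ = np.linalg.lstsq(X * sqrt_w[:, None], d * sqrt_w, rcond=None)
    residual = d - X @ b
    return b, float(np.sum(w * residual**2))


X = path_matrix(PAIRS, PATHS, n_edges=5)

# Made-up observed distances, listed in the order of PAIRS.
d_obs = np.array([3.1, 8.8, 10.2, 9.9, 11.3, 6.9])

# Ordinary least squares: all weights equal to one.
b_ols, s_ols = fit_branch_lengths(X, d_obs, np.ones_like(d_obs))

# Fitch-Margoliash weighting: variances taken as proportional to D^2.
b_fm, s_fm = fit_branch_lengths(X, d_obs, 1.0 / d_obs**2)
```

The fit here is unconstrained, so it can return negative branch lengths for some data; this sketch does not attempt to handle that case. Comparing the minimal $S$ across candidate topologies is what a heuristic search would then do.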

Generalized least squares

The ordinary and weighted least squares methods described above assume independent distance estimates. If the distances are derived from genomic data their estimates covary, because evolutionary events on internal branches (of the true tree) can push several distances up or down at the same time. The resulting covariances can be taken into account using the method of generalized least squares, i.e. minimizing the following quantity

    $$\sum_{ij,kl} w_{ij,kl} (D_{ij} - T_{ij}) (D_{kl} - T_{kl})$$

where $w_{ij,kl}$ are the entries of the inverse of the covariance matrix of the distance estimates.
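In matrix form the generalized criterion is $r^T V^{-1} r$, where $r$ stacks the residuals $D_{ij} - T_{ij}$ and $V$ is the covariance matrix of the distance estimates. The following is a minimal sketch of evaluating and minimizing it for a fixed topology, assuming NumPy and a four-leaf tree with topology ((0,1),(2,3)). The function names `gls_criterion` and `gls_fit` are hypothetical, and the distances and covariance matrix are made up for illustration.

```python
import numpy as np

# Path matrix of the assumed topology ((0,1),(2,3)). Rows are the leaf pairs
# (0,1), (0,2), (0,3), (1,2), (1,3), (2,3); columns 0..3 are the pendant
# branches of leaves 0..3 and column 4 is the internal branch.
X = np.array(
    [
        [1, 1, 0, 0, 0],
        [1, 0, 1, 0, 1],
        [1, 0, 0, 1, 1],
        [0, 1, 1, 0, 1],
        [0, 1, 0, 1, 1],
        [0, 0, 1, 1, 0],
    ],
    dtype=float,
)


def gls_criterion(d, t, cov):
    """Return r^T cov^{-1} r for the residual r = d - t."""
    r = d - t
    # Solve a linear system instead of forming the inverse explicitly.
    return float(r @ np.linalg.solve(cov, r))


def gls_fit(X, d, cov):
    """Branch lengths minimising the generalized least squares criterion."""
    vinv_X = np.linalg.solve(cov, X)  # cov^{-1} X
    vinv_d = np.linalg.solve(cov, d)  # cov^{-1} d
    # Normal equations: (X^T cov^{-1} X) b = X^T cov^{-1} d
    b = np.linalg.solve(X.T @ vinv_X, X.T @ vinv_d)
    return b, gls_criterion(d, X @ b, cov)


# Made-up observed distances and a made-up covariance matrix in which all
# distance estimates share a common positive covariance.
d_obs = np.array([3.1, 8.8, 10.2, 9.9, 11.3, 6.9])
cov = 0.5 * np.eye(6) + 0.1 * np.ones((6, 6))

b_gls, s_gls = gls_fit(X, d_obs, cov)
```

With a diagonal covariance matrix the criterion reduces to the weighted least squares sum with weights equal to the inverse variances, which is one way to sanity-check an implementation.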

Computational complexity

Finding the tree and branch lengths that minimize the least squares residual is an NP-complete problem. However, for a given tree with $n$ leaves, the optimal branch lengths can be determined in $O(n^2)$ time for ordinary least squares, $O(n^3)$ time for weighted least squares, and $O(n^4)$ time for generalized least squares (given the inverse of the covariance matrix).
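To give a sense of why exhaustive search over topologies is impractical, the number of unrooted binary topologies on $n \geq 3$ labelled leaves, 1 • 3 • 5 • ... • (2n-5), can be tabulated directly. The sketch below uses only the Python standard library; the function name `count_unrooted_topologies` is hypothetical.

```python
def count_unrooted_topologies(n_leaves):
    """Number of unrooted binary tree topologies on n_leaves labelled leaves.

    Computes the product of the odd numbers 1 * 3 * 5 * ... * (2n - 5).
    """
    if n_leaves < 3:
        raise ValueError("need at least three leaves")
    count = 1
    for odd in range(3, 2 * n_leaves - 4, 2):
        count *= odd
    return count


for n in (4, 5, 10, 20):
    print(n, count_unrooted_topologies(n))
```

Ten leaves already give 2,027,025 topologies, each of which would need its own branch-length fit under exhaustive search.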

Software

  • PHYLIP, a freely distributed phylogenetic analysis package containing an implementation of the weighted least squares method
  • PAUP, a similar package available for purchase
  • Darwin, a programming environment with a library of functions for statistics, numerics, sequence and phylogenetic analysis
