Tree rearrangement
Encyclopedia
Tree rearrangements are used in heuristic
algorithm
s devoted to searching for an optimal
tree structure
. They can be applied to any set of data that are naturally arranged into a tree, but have most applications in computational phylogenetics
, especially in maximum parsimony
and maximum likelihood
searches of phylogenetic tree
s, which seek to identify one among many possible trees that best explains the evolution
ary history of a particular gene
or species
.
s with a defined objective function to swap high-scoring subtrees into main trees that are high-scoring overall.
Heuristic
Heuristic refers to experience-based techniques for problem solving, learning, and discovery. Heuristic methods are used to speed up the process of finding a satisfactory solution, where an exhaustive search is impractical...
algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...
s devoted to searching for an optimal
Optimization (mathematics)
In mathematics, computational science, or management science, mathematical optimization refers to the selection of a best element from some set of available alternatives....
tree structure
Tree structure
A tree structure is a way of representing the hierarchical nature of a structure in a graphical form. It is named a "tree structure" because the classic representation resembles a tree, even though the chart is generally upside down compared to an actual tree, with the "root" at the top and the...
. They can be applied to any set of data that are naturally arranged into a tree, but have most applications in computational phylogenetics
Computational phylogenetics
Computational phylogenetics is the application of computational algorithms, methods and programs to phylogenetic analyses. The goal is to assemble a phylogenetic tree representing a hypothesis about the evolutionary ancestry of a set of genes, species, or other taxa...
, especially in maximum parsimony
Maximum parsimony
Parsimony is a non-parametric statistical method commonly used in computational phylogenetics for estimating phylogenies. Under parsimony, the preferred phylogenetic tree is the tree that requires the least evolutionary change to explain some observed data....
and maximum likelihood
Maximum likelihood
In statistics, maximum-likelihood estimation is a method of estimating the parameters of a statistical model. When applied to a data set and given a statistical model, maximum-likelihood estimation provides estimates for the model's parameters....
searches of phylogenetic tree
Phylogenetic tree
A phylogenetic tree or evolutionary tree is a branching diagram or "tree" showing the inferred evolutionary relationships among various biological species or other entities based upon similarities and differences in their physical and/or genetic characteristics...
s, which seek to identify one among many possible trees that best explains the evolution
Evolution
Evolution is any change across successive generations in the heritable characteristics of biological populations. Evolutionary processes give rise to diversity at every level of biological organisation, including species, individual organisms and molecules such as DNA and proteins.Life on Earth...
ary history of a particular gene
Gene
A gene is a molecular unit of heredity of a living organism. It is a name given to some stretches of DNA and RNA that code for a type of protein or for an RNA chain that has a function in the organism. Living beings depend on genes, as they specify all proteins and functional RNA chains...
or species
Species
In biology, a species is one of the basic units of biological classification and a taxonomic rank. A species is often defined as a group of organisms capable of interbreeding and producing fertile offspring. While in many cases this definition is adequate, more precise or differing measures are...
.
Basic tree rearrangements
The simplest tree-rearrangement, known as nearest-neighbor interchange, exchanges the connectivity of four subtrees within the main tree. Because there are three possible ways of connecting four subtrees, and one is the original connectivity, each interchange creates two new trees. Exhaustively searching the possible nearest-neighbors for each possible set of subtrees is the slowest but most optimizing way of performing this search. An alternative, more wide-ranging search, subtree pruning and regrafting (SPR), selects and removes a subtree from the main tree and reinserts it elsewhere on the main tree to create a new node. Finally, tree bisection and reconnection (TBR) detaches a subtree from the main tree at an interior node and then attempts all possible connections between branches of the two trees thus created. The increasing complexity of the tree rearrangement technique correlates with increasing computational time required for the search, although not necessarily with their performance.Tree fusion
The simplest type of tree fusion begins with two trees already identified as near-optimal; thus, they most likely have the majority of their nodes correct but may fail to resolve individual tree "leaves" properly; for example, the separation ((A,B),(C,D)) at a branch tip versus ((A,C),(B,D)) may be unresolved. Tree fusion swaps these two solutions between two otherwise near-optimal trees. Variants of the method use standard genetic algorithmGenetic algorithm
A genetic algorithm is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems...
s with a defined objective function to swap high-scoring subtrees into main trees that are high-scoring overall.