Project Status: Inactive.

Quartet

'Quartet' is an R package that calculates the quartet distance between two trees (Estabrook et al. 1985), a measure of their similarity based on the number of shared four-taxon subtrees.

The quartet distance outperforms a number of widely used tree distances (e.g. the Robinson–Foulds, path, and rearrangement distances) against a number of theoretical and practical measures (Steel & Penny 1993; Smith 2020), and is particularly valuable in the construction of tree spaces (Smith 2022). It can also be used, via QuartetConsensus(), to produce consensus trees that display more resolution than standard Robinson–Foulds-based majority-rule trees (Takazawa et al. 2026).

'Quartet' uses the 'tqDist' algorithm (Brodal et al. 2004; Sand et al. 2014) to compute quartet distances, and the CPDT algorithm of Jansson & Rajaby (2017) to compute triplet distances (via TripletDistance()). Unlike many other implementations, it distinguishes between quartets that are contradicted by one tree and quartets that are simply absent due to a lack of resolution (i.e. the presence of polytomies; see Smith 2019). 'Quartet' makes this distinction in both the quartet metric (function QuartetStatus()) and the partition metric (i.e. Robinson–Foulds distance; function SplitStatus()).
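As a minimal sketch of how this distinction surfaces in practice, the example below compares two small trees, one of which contains a polytomy. The tree topologies and variable names are hypothetical, chosen only for illustration, and the column meanings noted in the comments are an informal summary; consult ?QuartetStatus and ?SplitStatus for the authoritative definitions.

```r
library("Quartet")

# Two hypothetical six-leaf trees on the same leaf set.
# tree2 is only partly resolved: its root node is a polytomy.
tree1 <- ape::read.tree(text = "((a,b),(c,(d,(e,f))));")
tree2 <- ape::read.tree(text = "((a,b),(c,e),d,f);")
trees <- c(tree1, tree2)  # combine into a multiPhylo object

# One row per tree, each compared against `cf`.
# Columns are expected to include Q (total quartets), s (resolved the same
# way in both trees), d (resolved differently, i.e. contradicted), and
# r1, r2, u (unresolved in one tree or in both).
quartetStatus <- QuartetStatus(trees, cf = tree1)
quartetStatus

# The analogous tally for splits (bipartitions) rather than quartets.
splitStatus <- SplitStatus(trees, cf = tree1)
splitStatus
```

The first row compares tree1 with itself, so every quartet should be counted as shared there; the second row shows how the remaining quartets divide between contradiction and lack of resolution.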

Using the package

Install and load the library from CRAN as follows:

install.packages('Quartet')
library('Quartet')

For the latest features, install the development version:

if(!require("curl")) install.packages("curl")
if(!require("remotes")) install.packages("remotes")
remotes::install_github("ms609/Quartet")

You will need Rtools installed in order to build the development version from source.

View the function reference and basic usage instructions.

Known limitations

'Quartet' supports trees with up to 477 leaves. Larger trees contain more quartets than can be represented by R's signed 32-bit integers. The underlying 'tqDist' library may handle trees with up to 568 leaves, and 64-bit integer representations could increase this number further. Making either of these improvements within the R package would require substantial additional work, but could be done; please file an issue if this would be useful to you.
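The 477-leaf limit can be checked with a few lines of base R, since a tree with n leaves contains choose(n, 4) quartets. This is a sketch of the arithmetic only, not of the package internals; the helper name nQuartets is hypothetical. The final two lines note that 568 is where the count would exceed an unsigned 32-bit range, which matches the figure quoted above, although that reading of the 'tqDist' limit is an inference rather than something stated in the source.

```r
# Number of quartets (four-leaf subsets) among n leaves
nQuartets <- function(n) choose(n, 4)

# Largest value of R's signed 32-bit integer type (2^31 - 1)
intMax <- .Machine$integer.max

nQuartets(477) <= intMax  # TRUE: the count still fits
nQuartets(478) <= intMax  # FALSE: the count exceeds the integer range

# Assumption: the same check against an unsigned 32-bit range (2^32 - 1)
nQuartets(568) <= 2^32 - 1  # TRUE
nQuartets(569) <= 2^32 - 1  # FALSE
```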

References

  • Brodal G.S., Fagerberg R., Pedersen C.N.S. 2004. Computing the quartet distance between evolutionary trees in time O(n log n). Algorithmica 38:377–395.

  • Estabrook G.F., McMorris F.R., Meacham C.A. 1985. Comparison of undirected phylogenetic trees based on subtrees of four evolutionary units. Syst. Zool. 34:193–200.

  • Jansson J., Rajaby R. 2017. A more practical algorithm for the rooted triplet distance. J. Comput. Biol. 24:106–126. https://doi.org/10.1089/cmb.2016.0185

  • Sand A., Holt M.K., Johansen J., Brodal G.S., Mailund T., Pedersen C.N.S. 2014. tqDist: a library for computing the quartet and triplet distances between binary or general trees. Bioinformatics 30:2079–2080. https://doi.org/10.1093/bioinformatics/btu157

  • Smith M.R. 2019. Bayesian and parsimony approaches reconstruct informative trees from simulated morphological datasets. Biol. Lett. 15:20180632. https://doi.org/10.1098/rsbl.2018.0632

  • Smith M.R. 2020. Information theoretic generalized Robinson–Foulds metrics for comparing phylogenetic trees. Bioinformatics 36:5007–5013. https://dx.doi.org/10.1093/bioinformatics/btaa614

  • Smith M.R. 2022. Robust analysis of phylogenetic tree space. Syst. Biol. syab100. https://dx.doi.org/10.1093/sysbio/syab100

  • Steel M., Penny D. 1993. Distributions of tree comparison metrics: some new results. Syst. Biol. 42:126–141. https://doi.org/10.1093/sysbio/42.2.126

  • Takazawa Y., Takeda A., Hayamizu M., Gascuel O. 2026. Outperforming the majority-rule consensus tree using fine-grained dissimilarity measures. bioRxiv. https://doi.org/10.64898/2026.03.16.712085

Please note that the 'Quartet' project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
