
Phylogenetic Forest Spaces

by Jonas Lueg
Doctoral thesis
Date of Examination: 2022-10-04
Date of issue: 2023-01-10
Advisor: Prof. Dr. Stephan F. Huckemann
Referee: Prof. Dr. Stephan F. Huckemann
Referee: Dr. Tom M. W. Nye
Persistent Address: http://dx.doi.org/10.53846/goediss-9649

 

 

Files in this item

Name: Dissertation_Jonas_Lueg_Revision.pdf
Size: 1.07Mb
Format: PDF
Description: Revised dissertation from Jonas Lueg
Abstract

English

Given a fixed set of operational taxonomic units (OTUs), a common way to depict their evolutionary relationships is to construct a phylogenetic tree: a graph-theoretic tree whose root is the common ancestor (which one assumes to exist). Each leaf represents a unique OTU, and edges have positive real lengths representing some kind of evolutionary distance, which can be time but also other measures. Nowadays, such trees are generated from measured sequence data such as DNA, RNA or protein sequences, and with various measurements from the same species one usually obtains many different proposals for phylogenetic trees representing the evolutionary relationships. As one believes that there exists one true phylogenetic tree, one wishes to combine the different proposals into a single phylogenetic tree. For a statistician, the natural thing to do is to average the proposals in some sense in order to obtain a mean that is again a phylogenetic tree and that allows for further statistical methodology, such as confidence regions or testing. However, this requires an appropriate mathematical structure on the set of phylogenetic trees, preferably with a geometry, i.e. at least a metric space. The Billera-Holmes-Vogtmann tree space (BHV space) is an example of such a structure: it is a metric space that is CAT(0) and furthermore a Riemann stratified space of type B, cf. Billera et al. (2001). This space has favorable properties such as completeness and global non-positive curvature, which implies that any two points are connected by a unique geodesic. Furthermore, there is a polynomial-time algorithm to compute exact geodesics in this space, cf. Owen & Provan (2011).
Nonetheless, the space is artificially constructed by embedding it in a high-dimensional Euclidean space and, as a consequence, does not behave as biological understanding would expect a metric space of phylogenetic trees to behave, as discussed for example in Garba et al. (2018). They consider the distance between two trees with different structures as the edge lengths go to infinity. The same example motivates including disconnected forests in the space. A promising construction that covers this example and behaves as biological intuition suggests is our recently introduced Wald Space, cf. Garba et al. (2021a). It is based on the characterization of phylogenetic trees as covariance matrices, which is a byproduct of a generalization of the popular biological substitution models used to calculate likelihoods for trees given genetic sequence data; these substitution models are the backbone of phylogenetic tree estimation. In other words, the Wald Space is consistent with the tree estimation methods currently in use, up to the generalizations that have been made. More details can be found in Garba et al. (2021a). In this work, we concentrate on the Wald Space purely as a mathematical structure and aim to enable a better understanding of it. To this end, we introduce the mathematical structures required: metric spaces, Riemannian manifolds, Riemann stratified spaces and, essential for our construction, various geometries on the manifold of strictly positive definite symmetric real matrices. Furthermore, we introduce various ways to represent the phylogenetic trees and forests that we consider. Having introduced these more general and known concepts, we briefly introduce the BHV Space.
Then we define and describe the Wald Space, which is a topological stratified space, and we investigate its topological features. This part is the core of the thesis. Finally, we equip the Wald Space with a geometry that can be chosen to some degree and find that the resulting spaces are Riemann stratified spaces of type (A). Last but not least, we propose some numerical algorithms to calculate geodesics and distances in the Wald Space equipped with a geometry. These are not tested in this work, but are tested to some extent in Lueg et al. (2021).
Keywords: phylogenetics; Riemannian manifold; metric space; geometry; Fréchet mean; phylogenetic tree; stratified space
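
The abstract refers to geometries on the manifold of strictly positive definite symmetric matrices without naming them. As an illustration only, the following minimal sketch computes the distance under one widely used choice, the affine-invariant metric, for which d(A, B) is the Frobenius norm of log(A^{-1/2} B A^{-1/2}). Whether the thesis uses this particular geometry is an assumption, and the function name and example matrices are hypothetical.

```python
import numpy as np


def affine_invariant_distance(a, b):
    """Affine-invariant distance between two SPD matrices a and b.

    Illustrative sketch: this is one standard geometry on SPD matrices,
    not necessarily the one chosen in the thesis.
    """
    # The eigenvalues of a^{-1} b equal those of a^{-1/2} b a^{-1/2},
    # so no matrix square root is needed. They are real and positive
    # for SPD inputs; .real discards numerical round-off.
    eigenvalues = np.linalg.eigvals(np.linalg.solve(a, b)).real
    return float(np.sqrt(np.sum(np.log(eigenvalues) ** 2)))


# Hypothetical example matrices (not taken from the thesis).
identity = np.eye(2)
scaled = np.e * np.eye(2)
cov = np.array([[2.0, 0.5], [0.5, 1.0]])

print(affine_invariant_distance(identity, scaled))  # approximately sqrt(2)
print(affine_invariant_distance(identity, cov))
```

In this geometry, distances grow without bound as a matrix approaches the boundary of the positive definite cone, which is one reason the choice of geometry matters for any space built on covariance matrices.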
 
