Fast methods for metagenomic sequence search and annotation

by Milot Mirdita
Cumulative dissertation
Date of oral examination: 2022-02-21
Published: 2022-06-16
Supervisor: Dr. Johannes Söding
Referee: Dr. Johannes Söding
Referee: Prof. Dr. Stephan Waack
Referee: Prof. Dr. Antonio Fernandez-Guerra
To link to or cite this document: http://dx.doi.org/10.53846/goediss-9294

Files

Name: thesis-final-without-cv.pdf
Size: 38.9 MB
Format: PDF

Abstract

The past two decades have seen the development of metagenomics, the study of the genes and genomes of multiple organisms simultaneously. In contrast to traditional genomic techniques, which require isolating and growing individual organisms in the lab, metagenomic samples are taken directly from the environment, sequenced, and then analyzed in silico. Modern sequencing techniques have enabled high-throughput read-out of the DNA and RNA of microorganism communities in marine, soil, gut, and many other environments. The plethora of data generated using these techniques poses a major challenge for existing computational techniques. This burden translates directly into computational run times and the cost of the resources required to carry out metagenomic analyses. Thus, computational methods developed for metagenomic analysis require exceptional efficiency and speed. At the same time, metagenomic studies are becoming relevant to more and more fields of research, requiring that techniques be suited to a wide range of scientific disciplines. In this work, I present three methods I developed to address the throughput bottlenecks of data analysis in metagenomics. (1) The MMseqs2 webserver is a user-friendly extension of the popular homology search method MMseqs2, designed for non-expert bioinformaticians. I accelerated MMseqs2 to process single queries much more quickly and introduced an API to enable MMseqs2's use in web applications. (2) MMseqs2 taxonomy is a method for fast and accurate taxonomy assignment of metagenomic contigs. (3) ColabFold is a method that makes the groundbreaking AlphaFold2 protein structure predictions widely accessible, accelerating its input sequence alignment generation and improving its accuracy by assembling a novel database enriched with metagenomic sequences from a multitude of datasets.
These methods improve upon the state of the art by introducing novel algorithms and accelerating previous ones, such that previously infeasible analyses become possible, and by making our metagenomic toolbox accessible to users of a wide range of skill levels.
Keywords: Proteins; Sequence Analysis; Metagenomics; Protein Structure Prediction; Homology; Webserver
 
