Results - Details

Identifier 000450543
Title Microbial communities through the lens of high throughput sequencing, data integration and metabolic networks analysis
Alternative Title Μικροβιακές κοινότητες υπό το πρίσμα αλληλουχίας υψηλής απόδοσης, ολοκλήρωσης δεδομένων και ανάλυση μεταβολικών δικτύων
Author Ζαφειρόπουλος, Χαράλαμπος
Thesis advisor Λαδουκάκης, Εμμανουήλ
Reviewer Παφίλης, Ευάγγελος
Νικολάου, Χριστόφορος
Λύκα, Κωνσταντία
Σαρρής, Παναγιώτης
Carlsson, Jens
Faust, Karoline
Abstract Microbial communities are a cornerstone of most ecosystem types. To elucidate the mechanisms governing such assemblages, it is fundamental to identify the taxa present (who) and the processes that occur (what) in the various environments (where). Thanks to a series of technological breakthroughs, vast amounts of data from all levels of biological organization have accumulated over the last decades. In this context, microbial ecology studies now rely on bioinformatics methods and analyses. As a result, a great number of challenges have arisen from both the biologist's and the computer scientist's point of view; one of the most pressing is: "what shall we do with all these pieces of information?". The paradigm of Systems Biology addresses this challenge by moving from reductionism to more holistic approaches that attempt to interpret how the properties of a system emerge. The aim of this PhD was to enhance microbiome data analyses by developing software that addresses ongoing computational challenges in the study of microbial communities and, on top of that, to exploit such state-of-the-art methods to study microbial assemblages in extreme environments. To this end, the Tristomo marsh on Karpathos island (Greece) was chosen as a case study. Environmental DNA (eDNA) and metabarcoding have been widely used to estimate biodiversity (the who) and the structure of communities. Vast amounts of sequencing data targeting certain marker genes, depending on the taxonomic group of interest, have become available thanks to High Throughput Sequencing technologies. However, the bioinformatics analysis of such data requires multiple steps and parameter settings as well as increased computing resources. Workflows, along with computing infrastructures, ease this need to a great extent; in this context, a Pipeline for Environmental DNA Metabarcoding Analysis (PEMA) was developed (Chapter 2.1). However, eDNA metabarcoding has limitations too.
The cytochrome c oxidase subunit I (COI) gene is a commonly used marker, especially in studies targeting eukaryotic taxa. It is well known that in COI studies a great number of the derived Operational Taxonomic Units (OTUs) get no taxonomic hits. The presence of pseudogenes and of non-eukaryotic taxa among the amplicon data, together with the absence of the latter from the most commonly used reference databases, explains this phenomenon to a great extent. To identify such cases, the Dark mAtteR iNvestigator (DARN) software was developed; DARN makes use of a COI-oriented tree of life to provide further insight into such known unknown sequences (Chapter 2.2). Amplicon and shotgun metagenomics approaches, along with the rest of the omics technologies, have led to vast amounts of data and metadata recording the who, the what and the where. To enable optimal accessibility and usage of this information, a great number of databases, ontologies and community standards have been developed. By exploiting data integration techniques to bring such bits of information together, as well as text mining methods to retrieve knowledge "hidden" among the billions of text lines in already published literature, the PREGO knowledge base returns thousands of potential what - where - who associations (Chapter 3). The driving question, though, is how the different microbial taxa ensure their persistence as part of a community. Metabolic interactions among the various taxa play a decisive role in the composition of such assemblages. Genome-scale metabolic networks (GEMs) enable the inference of such interactions. Random sampling of the flux space of such metabolic models provides a representation of the flux values a model can attain under various conditions. However, flux sampling is computationally challenging, especially as the dimension of a metabolic model increases.
To address such challenges, a Python library called dingo was developed using a Multiphase Monte Carlo Sampling algorithm (Chapter 4). Finally, sediment and microbial mat samples as well as microbial aggregates from a hypersaline marsh in Tristomo bay (Karpathos, Greece) were analyzed. Both amplicon (16S rRNA) and shotgun sequencing data were used to characterize the microbial structure of the communities, and environmental parameters (e.g. salinity, oxygen concentration) were measured at the sampling sites. Key functions supporting life in such environments were identified, and metagenome-assembled genomes (MAGs) of the novel species found were built (Chapter 5). Similar to microbial communities, bioinformatics methods tend to build assemblages, while "living" on one's own is quite rare. The methods developed during this PhD project, combined with state-of-the-art methods, are intended to build a framework that enables moving from the community level to the species level and then back again to the community level. Such a framework is described for the study of microbial interactions in real-world communities.
Language English
Subject Metabarcoding
Metabolic modeling
Metagenome
Microbiome
Μεταβολική μοντελοποίση
Μεταγονιδιωματική
Μετακωδικοποίηση
Μικροβίωμα
Issue date 2022-10-07
Collection   School/Department--School of Sciences and Engineering--Department of Biology--Doctoral theses
  Type of Work--Doctoral theses
