
Post-graduate theses

Identifier 000434163
Title Learning biologically interpretable latent representations from gene expression data
Alternative Title Μαθαίνοντας βιολογικά ερμηνεύσιμες κρυφές αναπαραστάσεις από δεδομένα γονιδιακών εκφράσεων
Author Καραγιαννάκη, Ιουλία Ε.
Thesis advisor Τσαμαρδινός, Ιωάννης
Reviewer Τζιρίτας, Γεώργιος
Πανταζής, Γιάννης
Abstract Gene expression data are typically high-dimensional with low sample size. This creates several statistical and analytical challenges that must be overcome in order to analyze such data and infer their underlying biological mechanisms. To this end, several dimensionality reduction techniques have been proposed. Dimensionality reduction techniques learn a lower-dimensional space (the latent space) of newly constructed features and represent the data as a sum of those features (latent representations). Projecting the data onto the latent feature space compresses the data, retains the significant information and reduces noise. Typical dimensionality reduction techniques, such as Principal Component Analysis, derive latent representations that are not biologically interpretable. To regain a degree of interpretability, other methods return sparse latent representations. In particular, the new features are constructed as linear combinations of only a few of the molecular quantities. However, sparse latent representations are still hard to interpret biologically, as they do not directly correspond to known biological pathways or other known gene sets. In this thesis, we present a novel algorithm for feature construction and dimensionality reduction called Pathway Activity Score Learning (PASL). The major novelty of PASL is that the constructed features are constrained to directly correspond to known molecular pathways and can be interpreted as pathway activity scores. PASL is evaluated on both simulated and real data. We show that PASL retains the predictive information for disease classification on new, unseen datasets. We also show that differential activation analysis provides complementary information to standard gene set enrichment analysis.
Language English
Subject Dimensionality reduction
Disease classification
Κατηγοριοποίηση ασθενειών
Μείωση διαστάσεων
Issue date 2020-11-27
Collection   Faculty/Department--Faculty of Sciences and Engineering--Department of Computer Science--Post-graduate theses
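
Illustrative sketch The abstract contrasts dense latent features (as in Principal Component Analysis) with features constrained to known gene sets. The following is a minimal sketch of that contrast, assuming NumPy and random toy data. It is not the PASL algorithm described in the thesis: the function names, the gene set names `set_A` and `set_B`, and the choice of taking the first principal direction within each gene set are all assumptions made for illustration.

```python
import numpy as np

def pca_scores(X, k):
    """Project centered data onto its top-k principal directions (dense loadings)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions; in general every gene gets a nonzero weight.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:k].T                      # shape: (n_genes, k)
    return Xc @ loadings, loadings

def geneset_scores(X, genesets):
    """One feature per gene set: the first principal direction of that set's genes.

    Loadings are zero outside the gene set, so each score can be read as an
    activity score for that set. Illustrative only; this is not PASL.
    """
    Xc = X - X.mean(axis=0)
    loadings = np.zeros((X.shape[1], len(genesets)))
    for j, idx in enumerate(genesets.values()):
        _, _, Vt = np.linalg.svd(Xc[:, idx], full_matrices=False)
        loadings[idx, j] = Vt[0]             # weights only on genes in the set
    return Xc @ loadings, loadings

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 100))               # 20 samples, 100 genes (toy data)
genesets = {"set_A": [0, 1, 2, 3, 4], "set_B": [10, 11, 12]}  # hypothetical gene sets

Z_pca, W_pca = pca_scores(X, k=2)            # dense, hard to interpret
Z_gs, W_gs = geneset_scores(X, genesets)     # each column tied to one gene set
```

In this sketch, each column of `W_gs` has nonzero entries only for the genes of one gene set, whereas the columns of `W_pca` generally spread weight across all genes.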