Identifier 000412226
Title Latent feature construction for gene expressions improves predictions
Alternative Title Η κατασκευή κρυμμένων χαρακτηριστικών για γονιδιακές εκφράσεις βελτιώνει την προβλεπτική ικανότητα
Author Τσέλας, Χρήστος Ρ.
Thesis advisor Τσαμαρδινός, Ιωάννης
Reviewer Τζιρίτας, Γεώργιος
Στυλιανού, Ιωάννης
Abstract Gene expression analysis aims to improve the understanding of intrinsic cellular processes and to contribute to the successful implementation of personalized medicine. The advent of high-throughput gene expression technologies such as microarrays and RNA-sequencing (RNAseq), together with the recent reduction in cost, has resulted in an explosion of publicly available datasets. The generated datasets are inevitably high-dimensional with typically small sample sizes, which severely limits the potential for developing reproducible prognostic models. Being able to increase predictive power on a newly produced dataset without losing the information of the measured genome is of paramount importance. Although various studies attempt dimensionality reduction and dataset integration to increase classification performance and robustness, challenging issues remain, primarily due to the limited amount of data and the technological diversity and heterogeneity across datasets. Exploiting the redundancy of genomics data, we constructed low-dimensional, universal, latent feature spaces of the genome using several dimensionality reduction approaches and a diverse set of curated datasets. Standard Principal Component Analysis (PCA), kernel PCA and Neural Network Autoencoders were applied to datasets from four different platforms. While linear techniques showed better reconstruction performance, nonlinear approaches were able to capture more complex gene interactions and thus achieved stronger classification power. When newly seen gene expression datasets were projected to a latent space of 200 dimensions, classification power improved. Moreover, we performed a large-scale experiment in which the dimensionality reduction methods were trained on an integrated set of 59864 unique samples. Classification power improved further, especially for the Autoencoder.
Rather surprisingly, the statistical variability of the additional datasets increased classification performance, implying that intricate biological features were better learned. We additionally tested the possibility of cross-platform data augmentation by constructing an intermediate feature space, showing that when platforms share common characteristics (such as GPL570 and GPL96), predictive performance also improved.
Language English
Subject Dimensional reduction
Gene expression
Machine learning
Γονιδιακή έκφραση
Μηχανική μάθηση
Συμπίεση διαστάσεων
Issue date 2017-11-24
Collection   Faculty/Department--Faculty of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses