Abstract
In recent years, metabolomics has contributed to a range of scientific disciplines,
including medicine, microbiology, biotechnology, and toxicology. Through
metabolomics studies, researchers investigate changes that occur in biological systems,
usually caused by genetic factors, environmental influences, or disease.
Metabolomics allows us to examine metabolic pathways, interactions between
metabolites, and the roles they play in the proper or improper functioning of an
organism. By using advanced analytical technologies, such as mass spectrometry,
researchers can uncover new information about the functioning of various biological
systems.
To explore and evaluate metabolomic research data, we need software that facilitates the
detection, quantification, and alignment of peaks in complex systems.
Additionally, the information extracted by such software allows us to detect unique
characteristics and discover associations between metabolites and physiological or
pathological conditions. In this thesis, three software packages were evaluated, namely
MS-DIAL, XCMS, and Agilent Profinder.
The purpose of this study is to compare these software packages by evaluating the
performance of each in the analysis of untargeted metabolomic data. Through this
process, we sought to determine how the packages differ when applied to similar types
of metabolomic studies.
In this study, we used data derived from umbilical cord blood samples from 500
neonates, collected at four hospitals in four different countries. These
samples were analyzed using high-performance liquid chromatography coupled with
time-of-flight mass spectrometry (TOF-MS).
The statistical analysis was conducted in two parts. In the first part, we attempted to
reproduce the parameters of the original article, from which the data originated, using
a different pre-processing software package. Furthermore, using Spearman and Pearson
correlations, we compared our results with those of both the original article and a
second article reporting related observations on the same sample set. However, because
the detailed anthropometric information required to adjust the epidemiological
models was unavailable, we could not draw definitive conclusions from the first part of
the analysis.
Therefore, we proceeded to the second part, which focused on applying identical
parameters across the three software packages and comparing the number of features
obtained. Applying an unpaired t-test, we identified the statistically significant
features produced by each package and evaluated the similarities and differences among
the results, as well as the overall performance of each package in the analysis of the
same metabolomics data set.