Abstract |
Understanding the processes underlying gene expression regulation is a frontier of
molecular biology in the twenty-first century. However, the role of several mechanisms
during the transcription of DNA, such as promoter-proximal pausing (PPP) or the
RNA polymerase elongation rate, remains challenging to study and largely unknown.
In silico simulations offer researchers new opportunities to explore
these processes flexibly.
Thus, this research investigated the feasibility of studying transcription dynamics
using a novel in silico simulation tool, with a focus on genes transcribed by the enzyme
RNA polymerase II in humans.
To study gene expression regulation processes, an easily modifiable in silico transcription simulation tool was created. Five components of the transcription process
were implemented in the simulation. The first component was the initiation of transcription, including the binding of RNA polymerase to the promoter, promoter escape,
and PPP. The second component covered co-transcriptional splicing and alternative
splicing. The third component contained the recycling of RNA polymerase back into
the pool of active RNA polymerases (RNA pol pool). The fourth component addressed
lesions resolved through transcription-coupled nucleotide excision repair (TC-NER).
The final component concerned the elongation rate of RNA polymerases.
This research resulted in an easily accessible and highly optimized simulation of transcription, comprising more than 10,000 lines of Python code and 500
lines of C code. The simulated data it generated resembled real data obtained under similar conditions. In addition, a Markov chain model of initiation was created, and its estimates
were similar to the rates observed in the simulation. This
indicates that the relationship between the different parameters can be described analytically.
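To illustrate the kind of comparison described above, the following minimal Python sketch contrasts the release rate predicted by a small Markov chain of initiation with the rate counted in a stochastic simulation of the same chain. It is not the model developed in this work: the three states (empty promoter, bound polymerase, paused polymerase), the function names, and all transition probabilities are hypothetical and were chosen only for illustration.

```python
import random

# Hypothetical per-step transition probabilities (assumptions, not values from the study).
K_BIND = 0.05      # empty promoter -> polymerase bound
K_UNBIND = 0.02    # bound -> empty (polymerase dissociates)
K_ESCAPE = 0.10    # bound -> paused at the promoter-proximal site
K_RELEASE = 0.04   # paused -> released into elongation (promoter empty again)

EMPTY, BOUND, PAUSED = 0, 1, 2


def transition_matrix():
    """Row-stochastic matrix: P[i][j] = P(next state is j | current state is i)."""
    return [
        [1 - K_BIND, K_BIND, 0.0],
        [K_UNBIND, 1 - K_UNBIND - K_ESCAPE, K_ESCAPE],
        [K_RELEASE, 0.0, 1 - K_RELEASE],
    ]


def stationary_distribution(P, iterations=5000):
    """Approximate the stationary distribution by repeatedly applying P."""
    n = len(P)
    pi = [1.0] + [0.0] * (n - 1)
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def analytical_release_rate():
    """Expected releases per step: P(paused) times the release probability."""
    pi = stationary_distribution(transition_matrix())
    return pi[PAUSED] * K_RELEASE


def simulated_release_rate(steps=200_000, seed=0):
    """Count paused -> empty transitions in a random walk over the same chain."""
    rng = random.Random(seed)
    P = transition_matrix()
    state, releases = EMPTY, 0
    for _ in range(steps):
        nxt = rng.choices((EMPTY, BOUND, PAUSED), weights=P[state])[0]
        if state == PAUSED and nxt == EMPTY:
            releases += 1
        state = nxt
    return releases / steps


if __name__ == "__main__":
    print("analytical:", analytical_release_rate())
    print("simulated: ", simulated_release_rate())
```

With enough simulated steps the two rates should agree closely, which is the sense in which a Markov chain estimate can be checked against simulation output.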
The findings and the methods developed by this project can be used to advance
the understanding of transcription dynamics and their relationship with other gene
regulation mechanisms.
This study was limited by the fact that most of the parameters used in the simulation
were randomly generated because real data for them were lacking. Furthermore,
the models are complex and have many parameters, so the available data might not be
sufficient to estimate them.
|