
Machine learning and statistical analysis to identify determinants of survival in primary glioblastoma using genome wide expression studies

dc.contributor.advisor: Beißbarth, Tim Prof. Dr.
dc.contributor.author: Kalya Purushothama, Manasa
dc.title: Machine learning and statistical analysis to identify determinants of survival in primary glioblastoma using genome wide expression studies [de]
dc.contributor.referee: Waack, Stephan Prof. Dr.
dc.description.abstracteng: Glioblastoma (GBM) is the most common and highly aggressive brain tumor. GBM poses unique challenges due to high rates of tumor recurrence (34%) and resistance to treatment. Despite advances in treatment strategies, the median survival of glioblastoma patients has not improved beyond 12-15 months. Patients who survive less than 12 months are considered Short-Term Survivors (STS). Despite all the challenges associated with the disease, a small group of patients survive longer than 3 years and are termed Long-Term Survivors (LTS). Researchers in the area continue to be perplexed by this group of patients, since investigations of clinical, radiological, histological, and genetic features have failed to provide consensus on predictors of long-term response to current treatment. The goal of this study is to identify crucial survival factors in GBM and to elucidate the molecular processes that drive poor survival using gene expression profiles. To achieve this, I have used different computational approaches, which are discussed in detail in the four chapters of my thesis. Time-to-event analysis investigates the impact of any clinical or molecular factor on survival and is hence also called survival analysis. In Chapter 1, survival analysis is performed on 2309 GBM patients aggregated from 14 publicly available datasets to assess the impact of clinical and molecular factors such as age, gender, Karnofsky Performance Score, IDH mutation status, and MGMT promoter methylation status on survival. A meta-analytic approach is used to integrate the observations using the random effects model. Age and MGMT promoter methylation status were found to be factors of prognostic importance. This work also underlines the importance of age and quantifies the risk of death that a patient experiences based on age group.
For example, patients older than 70 years were found to experience a 2.4 times higher risk of death or poor survival than younger patients (40-50 years). In the next approach (Chapter 2) [Kalya et al., 2021a], I investigated gene-expression signatures that affect survival. Univariate Cox regression identified 720 genes that affect survival, and these are reported. Enrichment analysis revealed the beta-catenin network, the hypoxia pathway, IL-6 signaling pathways, cell-cell communication, and interferon signaling pathways, which are known for their diverse roles in GBM biology. Gene-regulatory networks were built with state-of-the-art promoter analysis and pathway analysis using the GenomeEnhancer pipeline. This analysis revealed 43 master regulators that regulate the signal transduction networks driving prognosis in glioblastoma. Based on the information available in the HumanPSD™ database, some of these targets have actionable drugs reported at multiple stages of clinical trials. To investigate the molecular mechanism underpinning poor prognosis and short survival in glioblastoma, gene-regulatory networks were built on the genes differentially upregulated (LogFC > 0.5) in short-term survivors of glioblastoma (Chapter 3) [Kalya et al., 2021b]. Regulatory networks are built on the network feedback loops of Walking pathways described earlier. Twelve transcription factors, including NANOG, PPARG, and FRA-1, were found enriched in the promoters of these dysregulated genes. Graph analysis of the signal transduction network upstream of these transcription factors revealed five potential master regulators that could explain gene dysregulation in short-term survivors: insulin-like growth factor binding protein (IGFBP2), vascular endothelial growth factor A (VEGF-A), its isoform VEGF165, platelet-derived growth factor A (PDGFA), oncostatin M (OSMR), and adipocyte enhancer-binding protein (AEBP1).
All of the identified master regulators were elevated in STS, and their expression patterns were computationally verified in two additional independent cohorts. This work proposes a novel mechanism of gene dysregulation in which IGFBP2 modulates a key molecule of tumor invasiveness and progression, the FRA-1 transcription factor. Machine learning has now become an indispensable tool in GBM research. Chapter 4 [Kalya et al., 2022] evaluates the application of 10 ML models to build a classifier that assigns GBM patients to short-term and long-term survivor groups based on their transcriptomic profiles and clinical information (age). A random forest model with an F1 score of 86.4% (accuracy = 80%, AUC = 74%) and good external validity is proposed. This classification model is deployed as a web tool. The important features are discussed for their biological relevance in the disease using gene ontology analysis, survival analysis, differential expression, and mapping onto existing databases of gene-disease biomarker associations. Using this approach, we have identified 199 mRNA-expression-based biomarkers that are associated with survival group prediction. Of these, 171 are not reported as biomarkers of glioblastoma in existing databases. In conclusion, this work evaluates clinical factors associated with prognosis using a meta-analytic approach. It proposes 242 gene-expression-based biomarkers associated with GBM survival using different computational approaches. Molecules such as PDGFA, AEBP1, and VEGF were found to be important both in gene-regulatory network analysis and in machine learning models. A novel mechanism of gene dysregulation in short-term survivors via the FRA-1 transcription factor, a key molecule of tumor invasiveness and progression, is proposed. The thesis encompasses 2 published research papers and a ready-to-submit version of the research
dc.contributor.coReferee: Wingender, Edgar Prof. Dr.
dc.subject.eng: Master regulator [de]
dc.subject.eng: gene regulatory network [de]
dc.subject.eng: Machine learning [de]
dc.affiliation.institute: Fakultät für Mathematik und Informatik [de]
dc.subject.gokfull: Informatik (PPN619939052) [de]
dc.notes.confirmationsent: Confirmation sent 2022-07-05T11:15:02 [de]
