Dissertation
A hybrid deep-machine learning approach for predicting environmentally responsive transgenerational differential DNA methylated regions (epimutations) in the genome
Washington State University
Doctor of Philosophy (PhD), Washington State University
2023
DOI: https://doi.org/10.7273/000005082
Abstract
Over the past few decades, machine learning (ML) has become a prominent solution across many applications. Representing data with the most informative set of features is essential for learning accurate models. In many datasets, this process is labor-intensive and requires the user to have enough domain knowledge to select relevant features. Deep learning (DL), a subfield of ML, has improved predictive model performance since the early 2000s by automatically extracting, analyzing, and understanding useful information directly from raw data. While DL allows automatic feature extraction from raw data, it requires a large amount of data and significant hyperparameter tuning. To overcome these challenges, we propose a hybrid model in which a DL component extracts features from the data and an ML component makes the final prediction. The model uses XGBoost, whose boosting process deals more effectively with imbalanced data and reduces the need for hyperparameter tuning. Additionally, the hybrid model allows the visualization of sequence motifs corresponding to the extracted features and the ranking of these features by their importance for the prediction task. A systematic analysis of the hybrid model is performed to identify the best settings in terms of the size of the deep network and the layer from which features are extracted.
The hybrid model was developed in the context of a particular target application: the identification of regions in the genome that are susceptible to becoming differential DNA methylated regions (DMRs) due to exposure to environmental toxicants. Results show that the proposed hybrid model outperforms both DL alone and ML alone for DMR epimutation prediction. The hybrid model is also used to identify unique exposure-specific transgenerational DMR epimutations and disease-specific DMR epimutations due to ancestral exposure to environmental toxicants. In addition to results on the target domain, the hybrid model is evaluated on several other domains, particularly on low-volume, high-dimensional datasets.
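The two-stage design described in the abstract (a feature extractor over raw sequence, followed by a boosted classifier that also ranks features) can be illustrated with a small, dependency-free sketch. This is not the dissertation's implementation, and every specific below is an assumption made for illustration: the filters are fixed one-hot motif detectors standing in for a trained convolutional layer, a tiny AdaBoost over decision stumps stands in for XGBoost, and the sequences, the motifs `CGCG` and `TATA`, and all function names are hypothetical.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 indicator values (A, C, G, T)."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]

def motif_features(seq, filters):
    """Slide each filter along the sequence and keep its maximum response.
    Mimics convolution + global max pooling, but with fixed (untrained) weights."""
    x = one_hot(seq)
    feats = []
    for f in filters:
        k = len(f)
        feats.append(max(
            sum(f[j][c] * x[i + j][c] for j in range(k) for c in range(4))
            for i in range(len(x) - k + 1)
        ))
    return feats

def stump_predict(row, j, t, pol):
    """Decision stump: vote `pol` if feature j reaches threshold t, else -pol."""
    return pol if row[j] >= t else -pol

def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(row, j, t, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best

def fit_boosted(X, y, rounds=5):
    """AdaBoost over stumps (a simple stand-in for XGBoost, not equivalent to it)."""
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(rounds):
        err, j, t, pol = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) on perfect stumps
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, pol))
        # Up-weight misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, pol))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    score = sum(a * stump_predict(row, j, t, pol) for a, j, t, pol in model)
    return 1 if score >= 0 else -1

def feature_importance(model, n_features):
    """Rank features by the total vote weight of the stumps that use them."""
    imp = [0.0] * n_features
    for a, j, _, _ in model:
        imp[j] += a
    return imp

# Toy data (hypothetical): positives carry a planted CGCG motif; the
# background alphabet has no G, so negatives can never contain it.
rng = random.Random(0)

def background(n):
    return "".join(rng.choice("ACT") for _ in range(n))

pos = [background(8) + "CGCG" + background(8) for _ in range(20)]
neg = [background(20) for _ in range(20)]
filters = [one_hot("CGCG"), one_hot("TATA")]  # stand-ins for learned filters

X = [motif_features(s, filters) for s in pos + neg]
y = [1] * len(pos) + [-1] * len(neg)
model = fit_boosted(X, y)
importance = feature_importance(model, len(filters))
```

In this toy setting, `predict(model, motif_features(seq, filters))` labels a new sequence, and `importance` gives one score per filter. Because each filter corresponds to a readable motif, ranking the filters is a rough analogue of the motif visualization and feature ranking that the abstract attributes to the hybrid model. In a real pipeline the filters would be learned by a deep network and the features taken from one of its layers.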
Details
- Title: A hybrid deep-machine learning approach for predicting environmentally responsive transgenerational differential DNA methylated regions (epimutations) in the genome
- Creators: Pegah Mavaie
- Contributors: Lawrence B. Holder (Advisor); Michael K. Skinner (Advisor); Ananth Kalyanaraman (Committee Member)
- Awarding Institution: Washington State University
- Academic Unit: Electrical Engineering and Computer Science, School of
- Theses and Dissertations: Doctor of Philosophy (PhD), Washington State University
- Publisher: Washington State University
- Number of pages: 207
- Identifiers: 99901019633501842
- Language: English
- Resource Type: Dissertation