Dissertation
Improving protein interactions prediction using machine learning and visual analytics
Washington State University
Doctor of Philosophy (PhD), Washington State University
12/2007
DOI: https://doi.org/10.7273/000005769
Abstract
The response of biological systems to external stimuli is governed by their cellular
interaction networks. Inferring these networks is therefore essential to deciphering
the basic operational principles of biological systems. Knowing
which proteins exist in a given organism or cell type and how these proteins interact
with each other is necessary for understanding biological processes at the whole-cell
level. The determination of protein-protein interaction (PPI) networks has been
the subject of extensive research, and it has been shown that domain-domain interactions
(DDIs) are good indicators of possible protein interactions and can predict protein
interactions more accurately than comparisons of full-length protein sequences. Despite the
development of reasonably successful methods, there is definite scope for improvement.
This thesis aims to develop machine-learning-based computational
techniques that use the domain information in proteins to predict PPI networks. This
research aims to make four major contributions to the field of PPIs. The first two are the
development of two new PPI prediction algorithms, DomainGA and DomainSVM.
DomainGA is a genetic-algorithm-based multi-parameter optimization method that
quantifies DDIs and uses them to predict PPIs. The second method, DomainSVM, uses
the DDI scores obtained from DomainGA in a Support Vector Machine (SVM) based
learning system to improve PPI prediction by overcoming the limitations of DomainGA.
These two methods can be used as a two-step filtering process to validate experimentally
detected PPIs. The third contribution is the assignment of scores to DDIs, which is proven to
discriminate between positive and negative PPIs. Finally, the fourth contribution is a
visual analytic environment called CABIN (Collective Analysis of Biological Interaction
Networks), which provides a one-of-its-kind tool to analyze, compare, and integrate
multiple predicted networks obtained from public data sources and/or inference
algorithms such as DomainGA and DomainSVM. The predicted interactions,
accompanied by a confidence score and an exploratory visualization environment, shall
help researchers validate experimental observations and/or make informed decisions
while generating hypotheses and models for designing new experiments.
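
The general idea of predicting a PPI from DDI scores can be illustrated with a minimal sketch. The abstract does not say how DomainGA combines DDI scores into a protein-level prediction, so the aggregation rule (taking the maximum over all domain pairs), the threshold, the function names, and the toy data below are all assumptions made for illustration, not the method of the dissertation.

```python
from itertools import product


def ppi_score(domains_a, domains_b, ddi_scores):
    """Return the highest DDI score over all domain pairs of two proteins.

    Max-aggregation is an illustrative assumption, not the rule used by
    DomainGA. Domain pairs without a known score count as 0.0.
    """
    return max(
        (ddi_scores.get(frozenset((da, db)), 0.0)
         for da, db in product(domains_a, domains_b)),
        default=0.0,  # covers proteins with no annotated domains
    )


def predict_interaction(domains_a, domains_b, ddi_scores, threshold=0.5):
    """Predict an interaction when the aggregated score reaches the threshold.

    The threshold value is hypothetical.
    """
    return ppi_score(domains_a, domains_b, ddi_scores) >= threshold


# Hypothetical toy data: DDI scores keyed by unordered domain pair,
# and the domains annotated on each protein.
ddi_scores = {
    frozenset(("SH3", "Pkinase")): 0.9,
    frozenset(("SH2", "PH")): 0.2,
}
protein_domains = {
    "P1": {"SH3", "SH2"},
    "P2": {"Pkinase"},
    "P3": {"PH"},
}

print(predict_interaction(protein_domains["P1"], protein_domains["P2"], ddi_scores))
print(predict_interaction(protein_domains["P1"], protein_domains["P3"], ddi_scores))
```

Keying the scores by `frozenset` makes each domain pair unordered, so a score stored for (SH3, Pkinase) is also found when the proteins are given in the opposite order.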
Details
- Title: Improving protein interactions prediction using machine learning and visual analytics
- Creators: Mudita Singhal
- Contributors: John H Miller (Chair) - Washington State University, School of Engineering and Applied Sciences (TRIC)
- Awarding Institution: Washington State University
- Academic Unit: School of Electrical Engineering and Computer Science
- Theses and Dissertations: Doctor of Philosophy (PhD), Washington State University
- Publisher: Washington State University
- Number of pages: 120
- Identifiers: 99901054940801842
- Language: English
- Resource Type: Dissertation