Dissertation
Improving protein remote homology detection using supervised and semi-supervised support vector machines
Washington State University
Doctor of Philosophy (PhD), Washington State University
05/2008
DOI: https://doi.org/10.7273/000005811
Abstract
Understanding protein functions, and thereby characterizing proteins, is essential to modeling complex biological systems; for example, developing drugs against a pathogen requires understanding the proteins whose function is relevant to virulent activity. To a degree, protein annotation is helped by the fact that protein sequences are evolutionarily related, or homologous. When a protein sequence is known to be homologous to an already characterized sequence, the properties of the latter can be propagated to the former. The task of detecting protein sequences that are similar (homologous) to a given query protein is termed homology detection. Computational homology detection is one of the oldest tasks in bioinformatics. Conventional methods of homology detection rely on the basic principle of sequence similarity. However, these methods fail to identify distantly related homologs whose protein sequences have diverged substantially. This thesis aims to develop machine-learning-based models of groups of protein sequences to improve the sensitivity of remote homolog detection. This research makes two significant contributions to the field of remote homology detection, each in the form of a new algorithm. The first contribution is the algorithm SVM-SimLOC, which utilizes sub-cellular localization information of protein sequences. The second algorithm, SVM-HUSTLE, is the first semi-supervised method of its kind to use support vector machines for remote homology detection. Lastly, we propose a systems-biology-based solution to practical homology detection using a publicly available tool called the Bioinformatics Resource Manager.
Details
- Title: Improving protein remote homology detection using supervised and semi-supervised support vector machines
- Creators: Anuj R. Shah
- Contributors: John H. Miller (Chair)
- Awarding Institution: Washington State University
- Academic Unit: School of Electrical Engineering and Computer Science
- Theses and Dissertations: Doctor of Philosophy (PhD), Washington State University
- Publisher: Washington State University
- Number of pages: 109
- Identifiers: 99901055035101842
- Language: English
- Resource Type: Dissertation