Dissertation
Topological Data Analysis for Computational Phenomics: Algorithms and Applications
Doctor of Philosophy (PhD), Washington State University
01/2020
Handle:
https://hdl.handle.net/2376/116839
Abstract
Phenomics is an emerging branch of modern biology that uses high-throughput phenotyping tools to capture multiple environmental and phenotypic traits, often at massive spatial and temporal scales. The resulting high-dimensional data represent a treasure trove of information for understanding in depth how multiple factors interact. However, computational tools that can parse such complex data and aid in extracting plausible hypotheses are currently lacking. We present a new algorithmic approach to visually explore complex phenomics data and to understand the influence of the environment on phenotypic traits. We model the problem as one of unsupervised structure discovery. Given a high-dimensional point cloud of phenomics data with functions defined on the points, the mapper algorithm (derived from algebraic topology) produces a compact summary in the form of a simplicial complex whose 1-skeleton forms a weighted directed acyclic graph G = (V, E).
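As an illustration of the mapper construction described above, the following is a minimal sketch and not the dissertation's implementation. It assumes a single real-valued filter function, a cover of its range by equal-length overlapping intervals, and single-linkage clustering with a distance threshold. The names `cluster` and `mapper_graph`, the parameters `n_intervals`, `overlap`, and `eps`, and the rule of orienting each edge from lower to higher mean filter value are all hypothetical choices made for the example; edge weights are omitted.

```python
from itertools import combinations
from math import dist


def cluster(indices, points, eps):
    """Single-linkage clusters: connected components of the eps-neighborhood graph."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining if dist(points[i], points[j]) <= eps}
            remaining -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, filter_values, n_intervals=4, overlap=0.25, eps=1.5):
    """Return (nodes, edges) of a one-dimensional mapper graph.

    nodes: list of frozensets of point indices (one per cluster).
    edges: set of (u, v) node-index pairs for clusters that share points,
           oriented from lower to higher mean filter value
           (an assumption made here so that the graph is acyclic).
    """
    lo, hi = min(filter_values), max(filter_values)
    length = (hi - lo) / n_intervals
    pad = overlap * length  # each interval is widened by this amount on both sides

    nodes = []
    for k in range(n_intervals):
        a = lo + k * length - pad
        b = lo + (k + 1) * length + pad
        members = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(cluster(members, points, eps))

    def mean_filter(node):
        return sum(filter_values[i] for i in node) / len(node)

    edges = set()
    for u, v in combinations(range(len(nodes)), 2):
        if nodes[u] & nodes[v]:  # overlapping clusters are joined by an edge
            if mean_filter(nodes[u]) <= mean_filter(nodes[v]):
                edges.add((u, v))
            else:
                edges.add((v, u))
    return nodes, edges


# Toy input: ten points on a line, filtered by their x-coordinate.
toy_points = [(float(x), 0.0) for x in range(10)]
toy_filter = [p[0] for p in toy_points]
toy_nodes, toy_edges = mapper_graph(toy_points, toy_filter)
```

On this toy input the four overlapping intervals each yield one cluster, and consecutive clusters share points, so the resulting graph is a simple chain.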
First, we provide an algorithmic framework to construct mapper objects from large-scale phenomics data sets. Second, we present algorithms to extract structurally interesting features from such topological objects. These features include:
a) Flares: We study interesting flares in G, which are structural branching features in data that characterize divergent behavior of subpopulations, in an unsupervised manner. We present algorithms to detect and rank flares.
b) Paths: We study interesting paths in G and rank them by their interestingness scores. The interestingness score of a path is defined as the sum, over the edges of the path, of each edge weight multiplied by a nonlinear function of that edge's rank, i.e., its depth along the path.
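The path score described in item b) can be sketched as follows. The abstract does not name the nonlinear rank function, so the default `log2(1 + rank)` used here is purely an assumption for illustration, and the function name `interestingness` and its `rank_fn` parameter are hypothetical.

```python
from math import log2


def interestingness(edge_weights, rank_fn=lambda rank: log2(1 + rank)):
    """Score a path given its edge weights in path order.

    The edge at depth r (1-based) contributes weight * rank_fn(r).
    rank_fn is a placeholder for the nonlinear rank function, which
    the abstract does not specify.
    """
    return sum(w * rank_fn(r) for r, w in enumerate(edge_weights, start=1))


# Rank two hypothetical paths, highest score first.
paths = {"p1": [0.5, 0.4, 0.9], "p2": [0.9, 0.1]}
ranked = sorted(paths, key=lambda name: interestingness(paths[name]), reverse=True)
```

Passing a different `rank_fn` changes how strongly deeper edges are emphasized without changing the rest of the computation.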
Finally, we present an open-source software implementation called Hyppo-X, which has interactive visualization capabilities to facilitate data navigation and hypothesis formulation. We test and evaluate Hyppo-X on three real-world plant (maize) data sets. Our results demonstrate the ability of our framework to delineate divergent subpopulation-level behavior.
To the best of our knowledge, the above set of contributions represents the first effort to apply topological data analysis (TDA) to the field of computational phenomics. We believe future pipelines in phenomics data analysis will benefit immensely from the structure discovery and visual analytic capabilities of TDA.
Details
- Title
- Topological Data Analysis for Computational Phenomics: Algorithms and Applications
- Creators
- Md Kamruzzaman
- Contributors
- Ananth Kalyanaraman (Advisor, Committee Member); Bala Krishnamoorthy (Advisor, Committee Member); Janardhan Rao Doppa (Committee Member); Haipeng Cai (Committee Member)
- Awarding Institution
- Washington State University
- Academic Unit
- Electrical Engineering and Computer Science, School of
- Theses and Dissertations
- Doctor of Philosophy (PhD), Washington State University
- Number of pages
- 129
- Identifiers
- 99900581610601842
- Language
- English
- Resource Type
- Dissertation