Dissertation
Constraint-based Refinement for Attributed Graphs
Doctor of Philosophy (PhD), Washington State University
01/2020
Handle:
https://hdl.handle.net/2376/111416
Abstract
Graph data quality is an emerging issue and is critical to graph analytics. Two graph quality issues are incompleteness (with missing links) and inconsistency (with conflicted attribute values). To mitigate these issues, link prediction is to infer the missing links that belong to the missing part of graphs, which improves the graph completeness; entity repair aims to detect the erroneous attribute values of nodes and change them to the correct values, which improves the graph consistency. This thesis proposes novel constraints with graph patterns incorporated and effective refinement algorithms for link prediction and entity repair in graphs, and further investigates a new problem of interactive graph refinement between multiple types of constraints with user feedback, such that it holistically enriches graph quality. (1) For link prediction, this thesis proposes a class of new data constraints namely GFCs, and develops efficient GFC discovery algorithms with performance guarantees and effective link prediction algorithms based on GFCs. Moreover, it extends GFCs with ontology, to further improve link prediction accuracy without sacrificing much efficiency. (2) For entity repair, novel constraints namely StarFDs are proposed to capture inconsistencies in graphs, by specifying graph functional dependencies with star structures. StarFDs enjoy efficient algorithms for error detection in attributed graphs. The repair problem aims to compute a graph that satisfies a set of StarFDs with minimum editing cost of attribute values, which is NP-hard and APX-hard. Despite the hardness, a dichotomous framework is proposed for the three cases of this problem that admits optimal, approximable, and bounded-cost solutions, all with performance guarantees. In addition, fundamental problems have been studied for StarFDs, i.e., satisfiability, validation, and implication. (3) An interactive graph refinement framework has been investigated, which integrates multiple constraints with user feedback holistically, to further enrich graph quality. This thesis proposes the system implementation for a user-centric graph refinement tool, which could impact the next generation of graph data management systems. It also shows that the proposed constraints can effectively benefit domain specific applications.
Metrics
Details
- Title
- Constraint-based Refinement for Attributed Graphs
- Creators
- Peng Lin
- Contributors
- Yinghui Wu (Advisor)Lawrence Holder (Committee Member)Assefaw Gebremedhin (Committee Member)
- Awarding Institution
- Washington State University
- Academic Unit
- Electrical Engineering and Computer Science, School of
- Theses and Dissertations
- Doctor of Philosophy (PhD), Washington State University
- Number of pages
- 200
- Identifiers
- 99900581808101842
- Language
- English
- Resource Type
- Dissertation