Thesis
Frequent subgraph mining on a single large graph using sampling techniques
Washington State University
Master of Science (MS), Washington State University
2010
Handle: https://hdl.handle.net/2376/102545
Abstract
Frequent subgraph mining is an important data mining task because identifying subgraphs that appear frequently in a graph lets us extract useful information from it. There are two kinds of frequent subgraph mining tasks, distinguished by the type of input graph data. The first is frequent subgraph mining on graph transactions, in which the input is a set of relatively small graphs. The second is frequent subgraph mining on a single graph, in which the input is one relatively large graph. This thesis investigates both tasks. Mining on a single graph is the more general task, because algorithms designed for a single graph can also be applied to graph transactions; this thesis therefore focuses mainly on the single-graph setting. One challenge for current frequent subgraph mining is that graph datasets are becoming increasingly large; examples include social network datasets such as Facebook and MySpace. When the graph is too large to fit in main memory, alternative techniques are needed to find frequent subgraphs efficiently. We investigate frequent subgraph mining on a single large graph using sampling approaches and find that sampling is a feasible approach for this task. This thesis evaluates different sampling methods and also presents a novel sampling method called "random areas selection sampling", which produces better results than all the current graph sampling approaches. The methods are evaluated on several large real-world graph datasets.
Details
- Title: Frequent subgraph mining on a single large graph using sampling techniques
- Creators: Ruoyu Zou
- Contributors: Lawrence B. Holder (Degree Supervisor)
- Awarding Institution: Washington State University
- Academic Unit: Electrical Engineering and Computer Science, School of
- Theses and Dissertations: Master of Science (MS), Washington State University
- Publisher: Washington State University; Pullman, Wash.
- Identifiers: 99900525023701842
- Language: English
- Resource Type: Thesis