Please use this identifier to cite or link to this item:
http://dspace.cityu.edu.hk/handle/2031/9223
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hang, Ching Nam | en_US |
dc.date.accessioned | 2020-01-16T02:30:53Z | - |
dc.date.available | 2020-01-16T02:30:53Z | - |
dc.date.issued | 2019 | en_US |
dc.identifier.other | 2019cshcn179 | en_US |
dc.identifier.uri | http://dspace.cityu.edu.hk/handle/2031/9223 | - |
dc.description.abstract | In this era of big data, data volume has grown at an unpredictable rate alongside the growth of the Internet of Things (IoT) in recent years. Data has become more difficult to manage, and data leakage has become a common risk. As a result, companies are increasingly aware of the need for cyber security. To detect suspicious activity in data, one of the most common strategies is to identify internet communities. In graph theory, a clique is a meaningful community structure and a fundamental concept in graph construction. In this project, we introduce a novel joint hierarchical clustering and parallel counting algorithm that performs high-performance, large-scale, data-intensive computing to count the number of cliques of different sizes in large graphs. Since the clique decision problem is NP-complete, we first design the algorithm to compute the exact number of triangles, then extend and modify it to achieve our ultimate goal. The algorithm consists of three major steps: pruning, hierarchical clustering, and parallel counting. It allows a scalable software framework, MapReduce, to calculate in parallel the number of cliques inside each cluster as well as those straddling clusters. We characterize the performance of the algorithm mathematically and evaluate it on representative graphs, including random graphs and social networks, to demonstrate its computational efficiency over other state-of-the-art techniques. | en_US |
dc.rights | This work is protected by copyright. Reproduction or distribution of the work in any format is prohibited without written permission of the copyright owner. | en_US |
dc.rights | Access is restricted to CityU users. | en_US |
dc.title | Large Graph Mining: Subgraph Isomorphism | en_US |
dc.contributor.department | Department of Computer Science | en_US |
dc.description.supervisor | Supervisor: Dr. Tan, Chee Wei; First Reader: Dr. Song, Linqi; Second Reader: Dr. Wong, Hau San Raymond | en_US |
Appears in Collections: | Computer Science - Undergraduate Final Year Projects |
Files in This Item:
File | Size | Format |
---|---|---|
fulltext.html | 148 B | HTML |
Items in Digital CityU Collections are protected by copyright, with all rights reserved, unless otherwise indicated.
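
The abstract describes counting triangles exactly, separating those inside each cluster from those straddling clusters. The full text is not available here, so the following is only a minimal sequential sketch of that split, not the project's algorithm: the function name `count_triangles_by_cluster`, the `cluster_of` mapping, and the sample graph are all hypothetical, and the pruning, hierarchical clustering, and MapReduce steps are not reproduced. It uses a standard degree-ordered edge orientation so each triangle is found exactly once.

```python
from collections import defaultdict


def count_triangles_by_cluster(edges, cluster_of):
    """Count triangles inside each cluster and those straddling clusters.

    edges: iterable of (u, v) pairs of an undirected graph (hypothetical input).
    cluster_of: dict mapping each vertex to a cluster label (assumed given).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # Orient every edge from lower to higher (degree, id) rank, so each
    # triangle is discovered exactly once, from its lowest-ranked vertex.
    rank = {v: (len(adj[v]), v) for v in adj}
    out = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}

    inside = defaultdict(int)
    straddling = 0
    for u in out:
        for v in out[u]:
            for w in out[u] & out[v]:
                labels = {cluster_of[u], cluster_of[v], cluster_of[w]}
                if len(labels) == 1:
                    inside[cluster_of[u]] += 1
                else:
                    straddling += 1
    return dict(inside), straddling


# Hypothetical example: two triangles in separate clusters plus one
# triangle (2, 3, 4) that spans both clusters.
sample_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3), (2, 4)]
sample_clusters = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
inside_counts, straddling_count = count_triangles_by_cluster(
    sample_edges, sample_clusters
)
```

In a parallel setting, the per-cluster counts are independent of one another and could be computed by separate workers, leaving only the straddling triangles to be handled across cluster boundaries; how the project actually distributes that work is not stated in this record.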