Run Run Shaw Library, City University of Hong Kong

Please use this identifier to cite or link to this item: http://dspace.cityu.edu.hk/handle/2031/9223
Full metadata record
DC Field | Value | Language
dc.contributor.author | Hang, Ching Nam | en_US
dc.date.accessioned | 2020-01-16T02:30:53Z | -
dc.date.available | 2020-01-16T02:30:53Z | -
dc.date.issued | 2019 | en_US
dc.identifier.other | 2019cshcn179 | en_US
dc.identifier.uri | http://dspace.cityu.edu.hk/handle/2031/9223 | -
dc.description.abstract | In this era of big data, data volumes have grown at an unpredictable rate with the rise of the Internet of Things (IoT) in recent years. Data has become more difficult to manage, and data leakage has become a common risk and a common mistake. As a result, companies are increasingly aware of the need for cyber security. One of the most common strategies for detecting suspicious activity in data is to identify internet communities. In graph theory, a clique is a meaningful community structure and a fundamental concept in graph construction. In this project, we introduce a novel joint hierarchical clustering and parallel counting algorithm that performs high-performance, large-scale, data-intensive computing to count the number of cliques of different sizes in large graphs. Since the clique decision problem is NP-complete, we initially design the algorithm to compute the exact number of triangles, then extend and modify it to achieve our ultimate goal. The algorithm consists of three major steps: pruning, hierarchical clustering, and parallel counting. It allows a scalable software framework, MapReduce, to calculate in parallel the number of cliques inside each cluster as well as those straddling clusters. We characterize the performance of the algorithm mathematically and evaluate it on representative graphs, including random graphs and social networks, to demonstrate its computational efficiency over other state-of-the-art techniques. | en_US
dc.rights | This work is protected by copyright. Reproduction or distribution of the work in any format is prohibited without written permission of the copyright owner. | en_US
dc.rights | Access is restricted to CityU users. | en_US
dc.title | Large Graph Mining: Subgraph Isomorphism | en_US
dc.contributor.department | Department of Computer Science | en_US
dc.description.supervisor | Supervisor: Dr. Tan, Chee Wei; First Reader: Dr. Song, Linqi; Second Reader: Dr. Wong, Hau San Raymond | en_US
Appears in Collections: Computer Science - Undergraduate Final Year Projects

Files in This Item:
File | Size | Format
fulltext.html | 148 B | HTML

Items in Digital CityU Collections are protected by copyright, with all rights reserved, unless otherwise indicated.