A study on fast plagiarism checking algorithm-I

Please use this identifier to cite or link to this item: http://dspace.cityu.edu.hk/handle/2031/6667

Title:	A study on fast plagiarism checking algorithm-I
Authors:	Li, Shiyin
Department:	Department of Electronic Engineering
Issue Date:	2012
Supervisor:	Dr. Pao, Derek C W; Assessor: Dr. Po, L M
Abstract:	Plagiarism detection systems have been widely used in universities for years, and they usually take several hours to process each paper. In this project, I apply a new algorithm to plagiarism detection software to reduce checking time. Before a paper is compared against a huge data pool, a series of pre-processing steps is performed. Each article in the pool is divided into segments of a few words each; I calculate a numerical identifier for each segment and then store all segments in a hash table, using their unique identifiers as hash keys. The new hash table structure has worst-case constant lookup time and space usage similar to that of binary search trees. During plagiarism detection, I compare segment identifiers instead of character strings. In general, the algorithm avoids direct character comparison in most cases and provides linear time complexity for plagiarism checking.
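The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation. The segment length of four words, the overlapping sliding window, the BLAKE2b-based identifier, and the function names `segments`, `segment_id`, `build_index`, and `check` are all assumptions made for illustration, and a built-in `dict` stands in for the project's hash table structure.

```python
import hashlib
from collections import defaultdict

SEGMENT_WORDS = 4  # assumed length; the abstract only says "a few words"


def segments(text, k=SEGMENT_WORDS):
    """Yield every run of k consecutive lowercased words in text.

    Overlapping windows are an assumption; the abstract does not say
    whether segments overlap.
    """
    words = text.lower().split()
    for i in range(len(words) - k + 1):
        yield " ".join(words[i:i + k])


def segment_id(segment):
    """Map a segment to a 64-bit integer (assumed identifier scheme)."""
    digest = hashlib.blake2b(segment.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def build_index(pool):
    """Pre-process the pool: identifier -> set of article names."""
    index = defaultdict(set)
    for name, text in pool.items():
        for seg in segments(text):
            index[segment_id(seg)].add(name)
    return index


def check(paper, index):
    """Count, per pool article, the segments of paper that also occur in it.

    Only integer identifiers are compared; no strings are compared here.
    """
    hits = defaultdict(int)
    for seg in segments(paper):
        for name in index.get(segment_id(seg), ()):
            hits[name] += 1
    return dict(hits)


pool = {
    "a": "the quick brown fox jumps over the lazy dog",
    "b": "completely unrelated text about hash tables here",
}
index = build_index(pool)
result = check("we saw the quick brown fox jumps yesterday", index)
print(result)
```

In this sketch the checking step does one lookup per segment, so its cost grows linearly with the length of the paper being checked. A Python `dict` gives constant lookup time only on average, so it does not reproduce the worst-case guarantee the abstract attributes to the project's hash table.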
Appears in Collections:	Electrical Engineering - Undergraduate Final Year Projects
