Run Run Shaw Library, City University of Hong Kong

Please use this identifier to cite or link to this item: http://dspace.cityu.edu.hk/handle/2031/8216
Title: Research Studies on Possible Improvements to the Aho-Corasick String Matching Algorithm
Authors: Xu, Yisi (許逸思)
Department: Department of Electronic Engineering
Issue Date: 2015
Course: EE4382 Project
Programme: Bachelor of Engineering (Honours) in Information Engineering
Supervisor: Dr. PAO, Derek C W; Assessor: Dr. SO, H C
Subjects: Matching theory.
Computer algorithms.
Description: Nominated as OAPS (Outstanding Academic Papers by Students) paper by Department in 2015-16.
Conference paper developed from this OAPS paper: Xu, Y., & Pao, D. (2015). Space-time tradeoff in the Aho-Corasick string matching algorithm. In 2015 IEEE Conference on Communications and Network Security (pp. 713-714). IEEE. doi: 10.1109/CNS.2015.7346899.
Citation: Xu, Y. (2015). Research studies on possible improvements to the Aho-Corasick string matching algorithm (Outstanding Academic Papers by Students (OAPS)). Retrieved from City University of Hong Kong, CityU Institutional Repository.
Abstract: This project implements four data structures (AC-basic, AC-expanded, AC-bitVec and AC-compressed) that realize the Aho-Corasick (AC) algorithm, and collects and compares statistics on their space and time requirements when scanning various input files. AC-expanded stores the fully expanded transition rule table in a two-dimensional array and takes up the most memory of the four. AC-basic uses linked lists to save memory, at the cost of lower processing speed. AC-bitVec replaces the linked lists with bit vectors to reduce both time and memory requirements. AC-compressed compresses the transition rule table by perfect hashing and elimination of transition edges, and offers a reasonable improvement over the other three. In the measurements, AC-compressed processes input 3.9 to 7.8 times as fast as AC-basic and at 34% to 83% of the speed of AC-expanded. In memory, AC-compressed needs only 3.8% to 6.3% of the space of AC-expanded, and 1.7 to 2.6 times that of AC-basic. The proposed AC-compressed implementation can therefore serve as a satisfactory trade-off between memory and speed relative to the basic and expanded versions.
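The fully expanded table that the abstract attributes to AC-expanded can be illustrated with a minimal Python sketch. This is not the author's implementation: the function names `build_expanded` and `search`, the 256-symbol byte alphabet, and the list-of-lists layout are all assumptions made for illustration. It builds the trie, computes failure links breadth-first, and fills one dense row per state so that scanning needs a single table lookup per input byte.

```python
from collections import deque

ALPHABET = 256  # assumption: input is scanned byte by byte


def build_expanded(patterns):
    """Return (table, out): a dense next-state table and per-state matches.

    Hypothetical helper for illustration; patterns are bytes objects.
    """
    goto = [{}]  # sparse trie edges for each state
    out = [[]]   # patterns that end at each state
    for p in patterns:
        s = 0
        for b in p:
            if b not in goto[s]:
                goto.append({})
                out.append([])
                goto[s][b] = len(goto) - 1
            s = goto[s][b]
        out[s].append(p)

    n = len(goto)
    fail = [0] * n
    # One row of ALPHABET entries per state: this is the memory cost.
    table = [[0] * ALPHABET for _ in range(n)]
    queue = deque()
    for b, t in goto[0].items():
        table[0][b] = t
        queue.append(t)
    while queue:
        s = queue.popleft()
        # Inherit matches reachable through the failure link.
        out[s] = out[s] + out[fail[s]]
        for b in range(ALPHABET):
            t = goto[s].get(b)
            if t is None:
                # No trie edge: copy the transition of the failure state.
                table[s][b] = table[fail[s]][b]
            else:
                table[s][b] = t
                fail[t] = table[fail[s]][b]
                queue.append(t)
    return table, out


def search(table, out, text):
    """Scan text once and return (start_offset, pattern) for each match."""
    s = 0
    hits = []
    for i, b in enumerate(text):
        s = table[s][b]  # exactly one lookup per input byte
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits
```

In this sketch the table holds one entry per (state, byte) pair regardless of how many trie edges actually exist, which is the kind of space cost the linked-list, bit-vector and compressed variants in the abstract aim to reduce.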
Appears in Collections: Electrical Engineering - Undergraduate Final Year Projects
OAPS - Dept. of Electrical Engineering

Files in This Item:
File: fulltext.html; Size: 145 B; Format: HTML
File: conference_paper.html; Size: 126 B; Format: HTML
File: authorpage-Xu_Yisi.htm; Size: 159 B; Format: HTML


Items in Digital CityU Collections are protected by copyright, with all rights reserved, unless otherwise indicated.
