Please use this identifier to cite or link to this item:
http://dspace.cityu.edu.hk/handle/2031/8216
Title: | Research Studies on Possible Improvements to the Aho-Corasick String Matching Algorithm |
Authors: | Xu, Yisi (許逸思) |
Department: | Department of Electronic Engineering |
Issue Date: | 2015 |
Course: | EE4382 Project |
Programme: | Bachelor of Engineering (Honours) in Information Engineering |
Supervisor: | Supervisor: Dr. PAO, Derek C W; Assessor: Dr. SO, H C |
Subjects: | Matching theory. Computer algorithms. |
Description: | Nominated as OAPS (Outstanding Academic Papers by Students) paper by Department in 2015-16. Conference paper developed from this OAPS paper: Xu, Y., & Pao, D. (2015). Space-time tradeoff in the Aho-Corasick string matching algorithm. In 2015 IEEE Conference on Communications and Network Security (pp. 713-714). IEEE. doi: 10.1109/CNS.2015.7346899. |
Citation: | Xu, Y. (2015). Research studies on possible improvements to the Aho-Corasick string matching algorithm (Outstanding Academic Papers by Students (OAPS)). Retrieved from City University of Hong Kong, CityU Institutional Repository. |
Abstract: | Four data structures (AC-basic, AC-expanded, AC-bitVec and AC-compressed) are implemented in this project to realize the Aho-Corasick (AC) algorithm. Statistics on the space and time requirements of these four implementations when scanning various input files are collected and compared. AC-expanded uses a two-dimensional array to store the fully expanded transition rule table, which takes up the most memory among the four. AC-basic uses linked lists to save memory, at the cost of lower processing speed. AC-bitVec replaces the linked lists with bit vectors to lower both the time and memory requirements. AC-compressed is a reasonable improvement over the other three: its transition rule table is compressed by perfect hashing and the elimination of transition edges. Comparing the results of running the four versions, the processing speed of AC-compressed is 3.9 to 7.8 times that of AC-basic and 34% to 83% of that of AC-expanded. In terms of memory, AC-compressed needs only 3.8% to 6.3% of the space of AC-expanded, and 1.7 to 2.6 times that of AC-basic. The proposed AC-compressed implementation can therefore serve as a satisfactory trade-off between memory and speed relative to the basic and expanded versions. |
Appears in Collections: | Electrical Engineering - Undergraduate Final Year Projects; OAPS - Dept. of Electrical Engineering |
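The AC-expanded layout described in the abstract (a two-dimensional array holding the fully expanded transition table) can be illustrated with a minimal sketch. This is not the project's code: the function names `build_expanded_table` and `scan`, the 256-symbol byte alphabet, and the choice of Python are all assumptions made for illustration.

```python
from collections import deque

ALPHABET_SIZE = 256  # assumption: byte-oriented input


def build_expanded_table(patterns):
    """Build a fully expanded Aho-Corasick transition table (hypothetical helper).

    Returns (table, outputs): table[state][byte] is the next state, and
    outputs[state] lists the patterns recognized on entering that state.
    """
    # Phase 1: build the trie; -1 marks a missing edge.
    table = [[-1] * ALPHABET_SIZE]
    outputs = [[]]
    for pattern in patterns:
        state = 0
        for byte in pattern:
            if table[state][byte] == -1:
                table.append([-1] * ALPHABET_SIZE)
                outputs.append([])
                table[state][byte] = len(table) - 1
            state = table[state][byte]
        outputs[state].append(pattern)

    # Phase 2: breadth-first pass that fills every missing edge via
    # failure links, so scanning never has to follow a failure chain.
    fail = [0] * len(table)
    queue = deque()
    for byte in range(ALPHABET_SIZE):
        child = table[0][byte]
        if child == -1:
            table[0][byte] = 0  # root loops to itself on unmatched bytes
        else:
            queue.append(child)  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        # Inherit the matches of the failure state (proper suffixes).
        outputs[state] = outputs[state] + outputs[fail[state]]
        for byte in range(ALPHABET_SIZE):
            child = table[state][byte]
            if child == -1:
                table[state][byte] = table[fail[state]][byte]
            else:
                fail[child] = table[fail[state]][byte]
                queue.append(child)
    return table, outputs


def scan(table, outputs, data):
    """Scan bytes with one table lookup per input byte (hypothetical helper)."""
    state = 0
    matches = []
    for position, byte in enumerate(data):
        state = table[state][byte]
        for pattern in outputs[state]:
            matches.append((position - len(pattern) + 1, pattern))
    return matches


if __name__ == "__main__":
    table, outputs = build_expanded_table([b"he", b"she", b"his", b"hers"])
    print(len(table))  # 10 states, each holding 256 entries
    print(scan(table, outputs, b"ushers"))
    # [(1, b'she'), (2, b'he'), (2, b'hers')]
```

In this sketch every state stores one entry per possible byte, so the four short patterns above already occupy 10 × 256 table entries. That per-state cost is the memory overhead the abstract attributes to AC-expanded, paid in exchange for a single array lookup per input byte.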
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
fulltext.html | | 145 B | HTML | View/Open |
conference_paper.html | | 126 B | HTML | View/Open |
authorpage-Xu_Yisi.htm | | 159 B | HTML | View/Open |
Items in Digital CityU Collections are protected by copyright, with all rights reserved, unless otherwise indicated.