Please use this identifier to cite or link to this item:
http://hdl.handle.net/2031/4403
| Title: | Web document analysis and its application to anti-phishing |
| Other Titles: | Wang lu wen dang fen xi ji qi zai fan wang diao shi qi zha fang mian de ying yong 網路文檔分析及其在反網釣式欺詐方面的應用 |
| Authors: | Huang, Guanglin (黃光霖) |
| Department: | Dept. of Computer Science |
| Degree: | Master of Philosophy |
| Issue Date: | 2006 |
| Publisher: | City University of Hong Kong |
| Subjects: | Internet -- Security measures; Phishing |
| Notes: | CityU Call Number: TK5105.875.I57 H827 2006. Includes bibliographical references (leaves 68-75). Thesis (M.Phil.)--City University of Hong Kong, 2006. vii, 75 leaves : ill. ; 30 cm. |
| Type: | Thesis |
| Abstract: | The Web is growing at an astonishing speed and is now the largest information and knowledge repository. The many web documents that accumulate need automatic processing and analysis for intelligent applications. In this thesis, we investigate web document analysis techniques and develop an application to anti-phishing. For web document analysis, a page segmentation approach based on visual factors is proposed and implemented. Based on the W3C DOM model of HTML, this approach first decomposes the whole web page into many independent salient blocks, each of which is visually and semantically consistent internally but distinguishable from adjacent blocks. In the second step, the approach aggregates these salient blocks into semantically meaningful blocks according to their positions and visual cues in the web page. In this bottom-up manner, the approach finally builds a hierarchical tree of segmented blocks. We apply our web page segmentation to the anti-phishing problem. Phishing web pages usually exhibit visual styles and structures similar to those of their targets. Based on web page segmentation, we propose three metrics (block level similarity, layout similarity, and overall style similarity) to evaluate the visual similarity between a phishing page and its target. If one of them exceeds a specific threshold, a phishing alarm is issued. We have built a prototype system to demonstrate the business model of our anti-phishing mechanism, and we believe our strategy can be used as an enterprise solution for anti-phishing. |
| Online Catalog Link: | http://lib.cityu.edu.hk/record=b2107096 |
| Appears in Collections: | CS - Master of Philosophy |
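
The alarm rule stated in the abstract (three similarity metrics, with an alarm when one of them exceeds a threshold) can be illustrated with a minimal sketch. This is not the thesis implementation: the names `SimilarityScores`, `exceeded_metrics`, and `is_phishing_alarm` are hypothetical, the scores are assumed to lie in [0, 1], the threshold values are placeholders because the abstract gives none, and "one of them" is read as "at least one".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimilarityScores:
    """Three visual-similarity scores between a suspect page and its target.

    Assumption: each score lies in [0, 1]; the abstract does not state a range.
    """
    block_level: float
    layout: float
    overall_style: float


# Hypothetical per-metric thresholds; the abstract gives no numeric values.
THRESHOLDS = SimilarityScores(block_level=0.8, layout=0.8, overall_style=0.8)

METRIC_NAMES = ("block_level", "layout", "overall_style")


def exceeded_metrics(scores, thresholds=THRESHOLDS):
    """Return the names of the metrics whose score exceeds its threshold."""
    return [
        name
        for name in METRIC_NAMES
        if getattr(scores, name) > getattr(thresholds, name)
    ]


def is_phishing_alarm(scores, thresholds=THRESHOLDS):
    """Issue an alarm when at least one metric exceeds its threshold."""
    return bool(exceeded_metrics(scores, thresholds))
```

For example, `is_phishing_alarm(SimilarityScores(0.9, 0.1, 0.2))` raises an alarm on block level similarity alone, which reflects the "any one metric" reading of the abstract.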
Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.