Please use this identifier to cite or link to this item:
http://dspace.cityu.edu.hk/handle/2031/537
Title: | Document clustering in email client system |
Authors: | Chang, Matthew Chor Ming |
Department: | Department of Computer Science |
Issue Date: | 2003 |
Supervisor: | Dr. C.K. Poon. First Reader: Dr. Y.T. Yu. Second Reader: Prof. Horace IP |
Abstract: | In this project, I study the problem of email clustering. I have implemented an email client, the CEMail system, to develop and test an email clustering algorithm that gives special consideration to the characteristics of email. I first identify the characteristics that distinguish email clustering from other kinds of document clustering. These characteristics help in designing an algorithm that classifies emails with a high accuracy rate. I then apply the notion of resemblance to measure the similarity between emails, and classify the emails using the k-nearest neighbor classification model. I show the efficiency of the supervised learning rate, a comparison with the Naïve Bayesian classification method, the elimination of heavy pre-processing, and the high accuracy rate achieved by this approach. I have realized the approach by implementing an application called the CEMail system in Java. I have also tested it by clustering an incoming email corpus received from various companies over a period of time. I found that the accuracy rate of clustering is as high as 95%. |
Appears in Collections: | Computer Science - Undergraduate Final Year Projects |
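The abstract describes measuring similarity between emails by resemblance and then classifying with a k-nearest neighbor model. The following is a minimal Java sketch of that general idea, not the CEMail implementation. It assumes that resemblance means the Jaccard overlap of word shingle sets, which is a common definition but is not confirmed by this record. The class name `ResemblanceKnnSketch`, the helper type `Labeled`, the shingle width `w`, and the tokenization rule are all hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ResemblanceKnnSketch {

    /** A training email with a known folder label (hypothetical helper type). */
    static final class Labeled {
        final String text;
        final String label;

        Labeled(String text, String label) {
            this.text = text;
            this.label = label;
        }
    }

    /** Splits text into lower-case alphanumeric words and returns the set of w-word shingles. */
    static Set<String> shingles(String text, int w) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        Set<String> result = new HashSet<>();
        for (int i = 0; i + w <= tokens.size(); i++) {
            result.add(String.join(" ", tokens.subList(i, i + w)));
        }
        return result;
    }

    /** Assumed definition of resemblance: |A intersect B| / |A union B|; 0.0 if both sets are empty. */
    static double resemblance(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    /**
     * Returns the majority label among the k training emails most similar to the query.
     * Ties go to the label that first reaches the top count while scanning from the
     * nearest neighbor outward. Returns null if the training list is empty.
     */
    static String classify(String email, List<Labeled> training, int k, int w) {
        Set<String> query = shingles(email, w);
        List<Labeled> sorted = new ArrayList<>(training);
        // Sort by descending resemblance (negated score gives descending order).
        sorted.sort(Comparator.comparingDouble(
                l -> -resemblance(query, shingles(l.text, w))));

        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (Labeled l : sorted.subList(0, Math.min(k, sorted.size()))) {
            int count = votes.merge(l.label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = l.label;
            }
        }
        return best;
    }
}
```

A call such as `ResemblanceKnnSketch.classify(newEmailText, trainingEmails, 3, 2)` would return the label most common among the three nearest labeled emails under this assumed measure. The sketch recomputes shingle sets during sorting for brevity; a real client would likely cache them per email.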
Files in This Item:
File | Size | Format | |
---|---|---|---|
fulltext.html | 164 B | HTML | View/Open |
Items in Digital CityU Collections are protected by copyright, with all rights reserved, unless otherwise indicated.