Please use this identifier to cite or link to this item:
http://dspace.cityu.edu.hk/handle/2031/9030
Title: | Heterogenous web data crawling and analysis using probabilistic modeling |
Authors: | Xie, Zhiyao |
Department: | Department of Electronic Engineering |
Issue Date: | 2017 |
Supervisor: | Supervisor: Prof. Chow, Tommy W S; Assessor: Dr. Cheung, Ray C C |
Abstract: | Social media services like Twitter and Facebook have had a tremendous impact on society and politics. On these platforms, millions of short messages covering a variety of topics are created every day. Opinion mining on such a large amount of data with machine learning algorithms is of high practical value. An application named 'twiOpinion' has been developed and is currently available on Linux and Mac OS. It accesses Twitter, a social networking service, crawls data, and performs binary supervised classification on the crawled tweets (messages posted on Twitter). It provides a user-friendly graphical interface and does not require users to write code or install dependencies. Usage of twiOpinion is covered. With twiOpinion, the most influential political events of 2016, such as the US presidential election, are analysed in order to verify the ability of twiOpinion to learn the public opinion reflected in short political discussions, typically of fewer than 140 characters. The supervised learning algorithms provided by twiOpinion, including Support Vector Machine, Naive Bayes Classifier and Decision Trees, are all applied to the crawled messages. The performance of the classifiers is carefully evaluated. Based on the classification results, linguistic sentiment analysis is performed, and a large amount of relevant user information is crawled and visualised. |
Appears in Collections: | Electrical Engineering - Undergraduate Final Year Projects |
Files in This Item:
File | Size | Format |
---|---|---|
fulltext.html | 147 B | HTML |