Please use this identifier to cite or link to this item:
http://dspace.cityu.edu.hk/handle/2031/8983
Title: | Complex network analysis for language |
Authors: | Fong, Cheuk Ho |
Department: | Department of Electronic Engineering |
Issue Date: | 2018 |
Supervisor: | Supervisor: Dr. Tang, Wallace K S; Assessor: Prof. Chen, Guanrong |
Abstract: | Natural language processing is of growing importance. This project explored the use of networks in the analysis of texts in literature. A software platform was designed and implemented to construct a network of words from an input document, enabling functions such as keyword extraction and summarization. The software was written in Python, and the Natural Language Toolkit (NLTK) was the most important package, used for filtering and lemmatization. A weighted adjacency matrix was obtained to represent the linkages between nodes (words) in the original document. To extract keywords, five centrality measures were used to quantify the influence of every node in the network and to obtain the corresponding keyword sets. The analysis shows that each measure may be suitable only for certain kinds of literature or certain criteria. Using these keyword sets, the main theme or key ideas of a text can be identified quickly. The developed software demonstrates the usefulness of networks for keyword generation, which has many practical applications. |
Appears in Collections: | Electrical Engineering - Undergraduate Final Year Projects |
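The pipeline outlined in the abstract (filter the words, build a weighted adjacency matrix, rank nodes by a centrality measure) can be illustrated with a minimal Python sketch. It is not the project's implementation. The stopword list, the function names, and the co-occurrence window are hypothetical. The filtering step is a simple stand-in for the NLTK filtering and lemmatization mentioned in the abstract. Weighted degree (node strength) is used only as one example of a centrality measure, since the abstract does not name the five that were used.

```python
import re

# Hypothetical, tiny stopword list standing in for NLTK-based filtering;
# no lemmatization is performed in this sketch.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "are", "on", "for", "by"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_adjacency(tokens, window=2):
    """Return (vocab, matrix) for an undirected word co-occurrence network.

    matrix[i][j] counts how often vocab[i] and vocab[j] appear within
    `window` consecutive tokens (window=2 links only adjacent words).
    """
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for pos, word in enumerate(tokens):
        for other in tokens[pos + 1 : pos + window]:
            if other == word:
                continue  # ignore self-loops
            i, j = index[word], index[other]
            matrix[i][j] += 1
            matrix[j][i] += 1
    return vocab, matrix


def strength_centrality(vocab, matrix):
    """Weighted degree: the sum of edge weights attached to each word."""
    return {w: sum(matrix[i]) for i, w in enumerate(vocab)}


def top_keywords(text, k=3, window=2):
    """Rank words by strength; ties are broken alphabetically."""
    vocab, matrix = build_adjacency(tokenize(text), window)
    scores = strength_centrality(vocab, matrix)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

For example, `top_keywords("the word network and the word graph", k=2)` ranks "word" first because it is linked to both other words. Other centrality measures would plug in at the same point, taking the same matrix as input.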
Files in This Item:
File | Size | Format | |
---|---|---|---|
fulltext.html | 148 B | HTML | View/Open |
Items in Digital CityU Collections are protected by copyright, with all rights reserved, unless otherwise indicated.