City University of Hong Kong

Title: Automatic text analysis using rhetorical structure theory with application for information search and retrieval
Other Titles: Ji yu xiu ci jie gou li lun de zi dong wen zhang fen xi ji xun xi sou suo
Authors: Wong, Cecilia Shuk Man (王淑雯)
Department: Dept. of Chinese, Translation and Linguistics
Degree: Doctor of Philosophy
Issue Date: 2004
Publisher: City University of Hong Kong
Subjects: Discourse analysis -- Data processing
Information storage and retrieval systems
Notes: CityU Call Number: P302.3.W66 2004
Includes bibliographical references (leaves 70-74)
Thesis (Ph.D.)--City University of Hong Kong, 2004
vi, 110 leaves : ill. ; 30 cm.
Type: Thesis
Abstract: Typical information retrieval systems rely on methods such as calculating keyword frequency and pattern matching on keywords. In this research project, I propose an approach to information search and retrieval that draws on two further sources of information: the Dublin Core Metadata Element Set (DCMES), a basic element set that represents the content or bibliographical information of the data, and linguistic information about the rhetorical structure of the text. The rhetorical structure information may be inferred from linguistic cues identified in the text. Both types of information are encoded as rules and facts in F-Logic (Frame-Logic). The cues and criteria for identifying rhetorical structure information are based on those developed by Corston-Oliver (1998). The text base consists of abstracts of linguistics journal articles, in both Chinese and English, drawn from a collection of over three hundred papers on Chinese linguistics. Information retrieval is web-based. Besides offering search and retrieval, the application can be extended with a web interface through which authors or publishers submit their abstracts to the text base. Because the data are abstracts of linguistics papers, part of the research investigates and analyzes the text structure of the abstracts. Since an abstract is usually created by extracting the main ideas of the text it describes, analyzing the structure of abstracts helps determine the structure of the whole article on which each abstract is based. By identifying the relations among the different spans in an abstract, one can infer the general structure of the whole article. In other words, investigating and analyzing the text structure of smaller-scale discourse, i.e. the abstracts, makes it possible to gain insight into that of larger-scale discourse, i.e. the papers. The research serves to further the development of ‘smart’ search facilities through the use of linguistic knowledge about the text. The approach rests on the assumption that the move structure and the rhetorical structure of texts are correlated. The results of the research support the validity of this assumption.
Appears in Collections: CTL - Doctor of Philosophy

Files in This Item:

File            Size    Format
fulltext.html   158 B   HTML
abstract.html   158 B   HTML

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.

