Please use this identifier to cite or link to this item:
http://hdl.handle.net/2031/4344
|
| Title: | Automatic text analysis using rhetorical structure theory with application for information search and retrieval |
| Other Titles: | Ji yu xiu ci jie gou li lun de zi dong wen zhang fen xi ji xun xi sou suo 基於修辭結構理論的自動文章分析及訊息搜索 |
| Authors: | Wong, Cecilia Shuk Man (王淑雯) |
| Department: | Dept. of Chinese, Translation and Linguistics |
| Degree: | Doctor of Philosophy |
| Issue Date: | 2004 |
| Publisher: | City University of Hong Kong |
| Subjects: | Discourse analysis -- Data processing; Information storage and retrieval systems; Rhetoric |
| Notes: | CityU Call Number: P302.3.W66 2004. Includes bibliographical references (leaves 70-74). Thesis (Ph.D.)--City University of Hong Kong, 2004. vi, 110 leaves : ill. ; 30 cm. |
| Type: | Thesis |
| Abstract: | In addition to the typical methods employed in information retrieval systems, such as calculating keyword frequency and keyword pattern matching, in this research project I propose an approach to information search and retrieval based not only on the basic element set known as the Dublin Core Metadata Element Set (DCMES), which represents the content or bibliographical information of the data, but also on linguistic information about the rhetorical structure of the text. This rhetorical structure information may be inferred from linguistic cues identified in the text. Both types of information are encoded as rules and facts in F-Logic (Frame-Logic). The cues and criteria for identifying rhetorical structure information are based on those developed by Corston-Oliver (1998). The text base consists of abstracts of linguistics journal articles drawn from a collection of over three hundred papers on Chinese linguistics, and includes abstracts from linguistics journals in both Chinese and English. Information retrieval is web-based. Besides offering a search and retrieval capability, the application can be extended with a web interface through which authors or publishers submit their abstracts to the text base. Because the data in this research are abstracts of linguistics articles, part of the research focuses on investigating and analyzing the text structure of the abstracts. Since an abstract is usually created by extracting the main ideas of the text it describes, analyzing the structure of abstracts helps in determining the structure of the whole article on which each abstract is based. By identifying the relations among the different spans in an abstract, one can infer the general structure of the whole article. In other words, investigating and analyzing the text structure of smaller-scale discourse, i.e. the abstracts, makes it possible to gain insight into that of larger-scale discourse, i.e. the papers. The research serves to further the development of ‘smart’ search facilities through the use of linguistic knowledge about the text. The approach is based on the assumption that the move structure and the rhetorical structure of texts are correlated. The results of the research support the validity of this assumption. |
| Online Catalog Link: | http://lib.cityu.edu.hk/record=b1871332 |
| Appears in Collections: | CTL - Doctor of Philosophy
|