City University of Hong Kong

Please use this identifier to cite or link to this item: http://hdl.handle.net/2031/6089

Title: Research of passage retrieval for question answering system
Other Titles: Mian xiang wen da xi tong de duan luo jian suo ji shu yan jiu
面向问答系统的段落检索技术研究
Authors: Li, Xin (黎新)
Department: Department of Computer Science
Degree: Doctor of Philosophy
Issue Date: 2010
Publisher: City University of Hong Kong
Subjects: Question-answering systems.
Text processing (Computer science)
Notes: CityU Call Number: QA76.9.Q4 L5 2010
viii, 86 leaves : ill. 30 cm.
Thesis (Ph.D.)--City University of Hong Kong, 2010.
Includes bibliographical references (leaves 77-86)
Type: thesis
Abstract: The rapid development of the Web has made it a huge information source and an important platform on which people can exchange and share knowledge. For example, users can easily acquire information from the Web with the help of search engines. However, the amount of information on the Web is so large that it is difficult for users to identify and select what is valuable. Hence, how to accurately retrieve and extract the information users need has become an important research topic. Question answering (QA) systems are an important research direction for next-generation search engines. A QA system has two distinguishing features: first, it allows users to submit a query as a natural language question instead of keywords; second, it responds with a concise and precise answer instead of a list of documents. Users can describe their information needs accurately, and the QA system can understand those needs and respond correctly.

The document retrieval module is an important component of a Web-based QA system. Usually, the retrieved documents undergo several computation-intensive procedures, including natural language processing, information extraction, and pattern matching, to determine the most likely answers. A QA system could be more efficient if it reduced the size of each document to be processed. Passage retrieval, which aims to find the text excerpts that may contain the exact answer to a given question, has long been studied in information retrieval (IR) and has recently become an important component of QA systems. The passage retrieval module is usually added as an intermediate stage between the document retrieval module and the answer extraction module. It facilitates quick answer extraction and makes it more efficient for users to find answers.

First, this dissertation analyzes the evaluation methods for document relevance and finds that document relevance is mainly density-based lexical relevance. Hence, document retrieval methods cannot be applied directly to passage retrieval. We discuss the definition of question answering passage retrieval and demonstrate the differences between document retrieval and passage retrieval in terms of topic, length, and keywords. We then propose some heuristics for designing passage retrieval formulas that better fit the requirements of QA systems.

Second, this dissertation proposes a Web-based question answering passage retrieval method. It describes the definition of passage retrieval and introduces the basic workflow and the function of each component. The dissertation proposes a heuristic query rewriting method to solve this problem. Keywords are not considered independently; instead, constraint relations are used to perform keyword matching and calculate the relevance score.

Finally, this dissertation proposes a novel mixture relevance model based on multiple features. It explores the effectiveness of lexical similarity, topic similarity, and structure similarity for passage retrieval. A Web-based method for computing the similarity between words is proposed and used to calculate the lexical similarity between a question and a passage. The dissertation then proposes a probabilistic topic language model to calculate the topic similarity between a question and a passage. For structure similarity, two structures are mainly considered: "wh-movement" and "predicate-argument". The three similarity metrics are then integrated into a weighted average metric for evaluating the relevance between a passage and a question.

Key Words: Web, Question Answering, Passage Retrieval, Lexical Similarity, Topic Language Model, Structure Similarity, Relevance
Online Catalog Link: http://lib.cityu.edu.hk/record=b3947519
Appears in Collections: CS - Doctor of Philosophy
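
The abstract ends by describing a relevance score formed as a weighted average of lexical, topic, and structure similarity. A minimal sketch of that combination step might look like the following. It is an illustration only, not the thesis's implementation: the names `SimilarityScores`, `mixture_relevance`, and `rank_passages` are hypothetical, the default weights are placeholders rather than values from the thesis, and each similarity is assumed to have been computed elsewhere and scaled to [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimilarityScores:
    """Similarity of one passage to the question (assumed to lie in [0, 1])."""
    lexical: float
    topic: float
    structure: float


def mixture_relevance(scores, weights=(0.4, 0.3, 0.3)):
    """Weighted average of the three similarity scores.

    The default weights are placeholders, not values from the thesis.
    Weights are normalised by their sum, so they need not add up to 1.
    """
    w_lex, w_top, w_str = weights
    total = w_lex + w_top + w_str
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = (w_lex * scores.lexical
                + w_top * scores.topic
                + w_str * scores.structure)
    return weighted / total


def rank_passages(scored, weights=(0.4, 0.3, 0.3)):
    """Sort (passage, SimilarityScores) pairs by descending relevance."""
    return sorted(scored,
                  key=lambda item: mixture_relevance(item[1], weights),
                  reverse=True)
```

In practice the weights would presumably be tuned on held-out question and passage pairs; the abstract does not say how the thesis sets them.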

Files in This Item:

File            Size    Format
abstract.html   132 B   HTML
fulltext.html   132 B   HTML

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 
