City University of Hong Kong

Please use this identifier to cite or link to this item: http://hdl.handle.net/2031/5784

Title: Concept-based video search by semantic and context reasoning
Other Titles: Shi yong yu yi he yu jing jin xing tui li de ji yu gai nian de shi pin jian suo
使用語義和語境進行推理的基於概念的視頻檢索
Authors: Wei, Xiaoyong (魏驍勇)
Department: Department of Computer Science
Degree: Doctor of Philosophy
Issue Date: 2009
Publisher: City University of Hong Kong
Subjects: Optical pattern recognition.
Image processing -- Digital techniques.
Notes: CityU Call Number: TA1650 .W44 2009
xiv, 133 leaves : ill. 30 cm.
Thesis (Ph.D.)--City University of Hong Kong, 2009.
Includes bibliographical references (leaves 122-133)
Type: thesis
Abstract: Semantic-based video retrieval has long been recognized as one of the hardest problems in multimedia computing. The challenges include 1) the lack of coincidence between low-level features and user expectations, which gives rise to the "semantic gap" problem, and 2) the fact that most users are accustomed to text-based queries, while the effectiveness of searching for videos with a few text keywords remains questionable. One popular search methodology that addresses these challenges is Concept-based Video Search (CBVS), in which a set of semantic concept detectors is developed for predicting query semantics. Under CBVS, detectors that can interpret search intentions are selected and fused by reasoning to answer a query. This thesis addresses three main open research issues in CBVS concept detector selection and fusion: which detectors should be selected for answering a given text query, how many should be selected, and how to fuse them. Two novel spaces, namely semantic space and context space, are proposed and developed to provide computable platforms for inter-concept relationship modeling and reasoning. With these spaces, detectors can be uniformly reasoned about and fused together for large-scale video search.
This thesis first proposes a novel construction of semantic space to determine concept similarity globally. In contrast to conventional reasoning over an ontology such as WordNet, this space provides a uniform, global similarity measure for inter-concept relationships. In this space, basis vectors are formed by modeling the ontological relationships among concepts, and each concept is represented as a vector for measuring similarity. Because ontology knowledge is taken into account when building the semantic space, we call the space "ontology enriched". We propose two variants of semantic space that differ in whether the space is orthogonal: the first is named Ontology-enriched Semantic Space (OSS), and the second is called Ontology-enriched Orthogonal Semantic Space (OS2). Both OSS and OS2 are successfully demonstrated on several tasks, including concept detector selection, word sense disambiguation and search.
In addition to semantic space, context space is proposed to address the fact that semantic concepts do not exist in isolation but are correlated with each other. Using such context relationships can also greatly facilitate concept selection and fusion. The developed context space considers the global consistency of concept relationships, addresses the problem of missing annotation, and is extensible to cross-domain detector fusion. The space can be built by modeling inter-concept relationships through annotation provided by either manual labeling or machine tagging. Context space has been successfully demonstrated for the task of Context-based Concept Fusion (CBCF) in both concept detector development and search.
With the semantic space and context space, a novel multi-level fusion framework is then proposed for CBVS. The framework for answering queries considers several aspects of detectors, including their semantics, context, reliability and diversity. In the concept selection step, the number of appropriate detectors is adaptively determined by joint reasoning in the semantic and context spaces. In the fusion step, the selected detectors are combined hierarchically, with each level of fusion emphasizing one aspect of the detectors. Experimental results obtained using our methodology on the TRECVID datasets of 2005 to 2008 have been very encouraging and demonstrate state-of-the-art performance for CBVS.
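Illustrative sketch (not part of the thesis record): the abstract describes representing each concept as a vector, selecting detectors by similarity to the query, and fusing the selected detectors. The Python below is a minimal toy version of that general idea only. It assumes cosine similarity, a fixed similarity threshold standing in for the adaptive choice of how many detectors to use, and a similarity-weighted average as the fusion rule. None of these choices, nor the names cosine, select_detectors, fuse, or the example vectors and scores, come from the thesis; they are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_detectors(query_vec, concept_vecs, threshold=0.5):
    """Return (concept, similarity) pairs at or above the threshold, best first.

    The fixed threshold is an assumption made for this sketch; it is a
    stand-in for deciding how many detectors to keep.
    """
    scored = [(name, cosine(query_vec, vec)) for name, vec in concept_vecs.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def fuse(shot_scores, selected):
    """Similarity-weighted average of the selected detectors' scores for one shot."""
    total = sum(weight for _, weight in selected)
    if total == 0:
        return 0.0
    weighted = sum(weight * shot_scores.get(name, 0.0) for name, weight in selected)
    return weighted / total


# Hypothetical two-dimensional concept vectors and detector outputs.
concept_vecs = {"car": [1.0, 0.0], "road": [0.6, 0.8], "sky": [0.0, 1.0]}
query_vec = [1.0, 0.0]
selected = select_detectors(query_vec, concept_vecs)
shot_scores = {"car": 0.9, "road": 0.5, "sky": 0.2}
print([name for name, _ in selected], round(fuse(shot_scores, selected), 3))
```

In this toy setup the query vector coincides with the "car" vector, so "car" and "road" pass the threshold and "sky" is dropped; the fused score for the shot is the average of the two remaining detector scores, weighted by their similarity to the query.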
Online Catalog Link: http://lib.cityu.edu.hk/record=b2375050
Appears in Collections: CS - Doctor of Philosophy

Files in This Item:

File           Description  Size   Format
abstract.html               132 B  HTML
fulltext.html               132 B  HTML

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 
