City University of Hong Kong

Title: Large scale semantic concept detection, fusion, and selection for domain adaptive video search
Other Titles: Da gui mo yu yi gai nian de jian ce, rong he ji xuan ze jin xing shu ju yu zi shi ying shi pin jian suo
大規模語義概念的檢測, 融合及選擇進行數據域自適應視頻檢索
Authors: Jiang, Yugang (姜育剛)
Department: Department of Computer Science
Degree: Doctor of Philosophy
Issue Date: 2009
Publisher: City University of Hong Kong
Subjects: Optical pattern recognition.
Image processing -- Digital techniques.
Notes: CityU Call Number: TA1650 .J53 2009
xv, 161 leaves : ill. 30 cm.
Thesis (Ph.D.)--City University of Hong Kong, 2009.
Includes bibliographical references (leaves 145-161)
Type: thesis
Abstract: This thesis investigates the problem of video search based on semantic concepts. We present approaches to handle three correlated issues that are critical to this problem: (1) how to construct an effective feature representation for semantic concept detection, (2) how to exploit semantic context to improve the detection of these concepts, and (3) how to select the most suitable concept detectors to answer user queries. In particular, as the target videos may come from different domains (genres or sources) with distinctive data characteristics, for each of the issues, we will need to cope with the domain changes.
Video frames are represented by bag-of-visual-words (BoW) derived from local keypoint features, which are invariant to rotation, scale and illumination. We first conduct a comprehensive study on the representation choices of BoW, including vocabulary size, weighting scheme, stop word removal, feature selection, spatial information, and visual bi-gram. The aim is to offer practical insights in how these choices will impact the performance of BoW for semantic concept detection. We also show how to further augment the BoW representation by exploring the linguistic and ontological aspects of visual words. A visual-word ontology is constructed to hierarchically specify their hyponym relationship, which is incorporated into BoW for improved video frame representation.
To exploit semantic context, we develop a novel and efficient domain adaptive semantic diffusion algorithm. Inter-concept relationship is modeled using a semantic graph, which treats concepts as nodes and the concept affinities as the weights of edges. It is then applied to refine the initial detection results through a function level graph diffusion process, aiming to recover the consistency and smoothness of the detection results over the graph. To handle the domain change between training and test sets, our algorithm involves a graph adaptation process which iteratively refines the concept affinity based on the target domain data characteristics. This algorithm is efficient and scalable to large scale data sets.
For the selection of concept detectors, we focus on exploring heterogeneous knowledge sources for better measurement of query-detector similarity. Instead of using WordNet as in most existing works, we exploit the context information associated with Flickr images to estimate the similarity between queries and concept detectors. This similarity measure, named FCS, reflects the word correlation in images rather than text corpora. With an initial detector set selected by FCS for each query, we further propose a semantic context transfer algorithm that adapts the query-detector similarity to a target data set. The adaptation process is highly efficient, satisfying the critical requirement of online video search.
We evaluate all the proposed techniques on large scale video search benchmarks provided by TRECVID from years 2005 to 2008. Experimental evaluations demonstrate promising results of our techniques, and their potential to be applied to other applications such as visual object categorization and web scale image retrieval.
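Illustrative note (not part of the thesis record): the abstract describes refining initial detection scores by diffusion over a graph whose nodes are concepts and whose edge weights are concept affinities. The sketch below shows one generic way such smoothing could look. It is an assumption-laden illustration, not the thesis's algorithm: the function names `diffuse_scores` and `smoothness_energy`, the unnormalized update rule, the step size, and the toy scores are all hypothetical, and the graph adaptation step mentioned in the abstract is not covered.

```python
def smoothness_energy(scores, affinity):
    """Sum over concept pairs of affinity * squared score difference.

    scores[i][n]   : detection score of concept i on shot n
    affinity[i][j] : symmetric, non-negative affinity between concepts i and j
    A low value means strongly related concepts have similar scores.
    """
    total = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            for g_i, g_j in zip(scores[i], scores[j]):
                total += 0.5 * affinity[i][j] * (g_i - g_j) ** 2
    return total


def diffuse_scores(scores, affinity, step=0.1, iterations=20):
    """Iteratively pull each concept's scores toward those of related concepts.

    Each iteration is a descent-style step on the smoothness energy above
    (up to a constant factor). The step must be small relative to the
    largest total affinity of any concept, or the iteration can diverge.
    """
    current = [list(row) for row in scores]
    num_concepts = len(current)
    for _ in range(iterations):
        updated = []
        for i in range(num_concepts):
            row = []
            for n, g_in in enumerate(current[i]):
                # Affinity-weighted disagreement with every other concept.
                pull = sum(
                    affinity[i][j] * (g_in - current[j][n])
                    for j in range(num_concepts)
                    if j != i
                )
                row.append(g_in - step * pull)
            updated.append(row)
        current = updated
    return current


if __name__ == "__main__":
    # Toy data (hypothetical): two related concepts scored on two shots.
    initial = [[0.9, 0.1],   # "road"
               [0.3, 0.2]]   # "car"
    graph = [[0.0, 0.8],
             [0.8, 0.0]]
    refined = diffuse_scores(initial, graph)
    print(smoothness_energy(initial, graph), smoothness_energy(refined, graph))
```

In this toy setting the weakly detected concept is raised on shots where its strongly related neighbour scores high, which is the kind of consistency the abstract attributes to diffusion over the semantic graph.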
Appears in Collections: CS - Doctor of Philosophy

Files in This Item:

File            Size    Format
abstract.html   132 B   HTML
fulltext.html   132 B   HTML

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.

