|Title: ||On clustering, detection, and threading of topics for large scale videos with multiple modalities|
|Other Titles: ||Tong guo duo mo tai ju lei, jian ce yi ji xian suo da gui mo shi pin zhu ti|
|Authors: ||Wu, Xiao (吳曉)|
|Department: ||Department of Computer Science|
|Degree: ||Doctor of Philosophy|
|Issue Date: ||2008|
|Publisher: ||City University of Hong Kong|
|Subjects: ||Digital video.|
|Notes: ||xi, 154 leaves : col. ill. 30 cm.|
Thesis (Ph.D.)--City University of Hong Kong, 2008.
Includes bibliographical references (leaves 144-152)
CityU Call Number: TK6680.5 .W83 2008
|Abstract: ||With the popularity of social media, particularly in Web 2.0, the number of videos on the Internet has grown exponentially. These videos are easily accessible and can be formatted, duplicated, edited and presented in a variety of ways. In this thesis, we study the usefulness of visual near-duplicates for various tasks related to both MM (multimedia) and TDT (topic detection and tracking). MM and TDT have both been studied before, but separately, in the multimedia and information retrieval communities respectively. We investigate TDT issues in a multi-modality setting, particularly with visual near-duplicates as an underlying constraint, for the clustering, detection, threading and re-ranking of videos across sources, languages and countries.
We begin by studying the retrieval of visual near-duplicates in a news video corpus. Visual keywords generated from the local features of keyframes are modeled with cosine similarity and language models. By incorporating textual information derived from semantic context, we demonstrate the advantages of retrieving near-duplicates in a multi-modality setting, compared with state-of-the-art techniques. We further treat visual duplicates as a constraint in clustering news stories in large-scale videos. We explore several techniques, including the co-clustering of stories in a low-dimensional space under constraints and adaptive density-based clustering driven by constraints. These techniques are integrated into a single method, CCC (constraint-driven co-clustering), which enables effective clustering of news stories into clusters of varying shapes, sizes and densities. With stories clustered according to topics, we also investigate the representation of near-duplicates as a visual language to detect the novelty and redundancy of stories in a multi-modal and multi-lingual environment. Various models are evaluated to compare and contrast the fundamental differences between visual and text languages. Finally, we propose a novelty-driven organization of each topic into a tree structure, which facilitates the summarization and browsing of topic events.
As a byproduct of this study, we also apply the proposed techniques to re-rank videos obtained from YouTube, Yahoo! and Google by rapidly finding visual near-duplicates. To this end, we propose a hierarchical approach that integrates color signatures and local features for the efficient and accurate filtering of near-duplicates in the Web environment.
In brief, our theoretical and practical findings show that visual near-duplicates can be exploited at multiple stages of TDT tasks, whether as features, as a constraint, or as a language. Bringing MM resources to TDT tasks benefits both the organization of broadcast videos with many near-duplicates across sources and the novelty ranking of web videos with many intertwined near-duplicate versions.|
|Online Catalog Link: ||http://lib.cityu.edu.hk/record=b2268794|
|Appears in Collections:||CS - Doctor of Philosophy |
Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.