|
|
CityU Institutional Repository >
CityU Electronic Theses and Dissertations >
ETD - Dept. of Computer Science >
CS - Doctor of Philosophy >
Please use this identifier to cite or link to this item:
http://hdl.handle.net/2031/6222
|
| Title: | Algorithms and techniques for large-scale near-duplicate video mining |
| Other Titles: | Da gui mo jin si shi pin jian suo de suan fa ji ji shu yan jiu 大規模近似視頻檢索的算法及技術研究 |
| Authors: | Zhao, Wanlei (趙萬磊) |
| Department: | Department of Computer Science |
| Degree: | Doctor of Philosophy |
| Issue Date: | 2010 |
| Publisher: | City University of Hong Kong |
| Subjects: | Data mining. Multimedia systems. |
| Notes: | CityU Call Number: QA76.9.D343 Z43 2010 xi, 155 leaves : ill. (some col.) 30 cm. Thesis (Ph.D.)--City University of Hong Kong, 2010. Includes bibliographical references (leaves 143-152) |
| Type: | thesis |
| Abstract: | As bandwidth accessible to average users is increasing, video has become one
of the fastest growing data type in Internet. Especially with the popularity of social
media, there has been exponential growth in videos available on the Web. Among
these huge volumes of videos, there exist large numbers of near-duplicates and copies.
Near-duplicates (NDs) carry both blessing and redundant signals. For example, ND
provides rich visual clue for indexing and summarizing broadcast videos from different
channels. On the other hand, the excessive amount of ND makes browsing web
videos streamed over Internet an extremely time-consuming task. In this thesis, we
conduct studies to research the algorithms and techniques for large-scale mining of
near-duplicate videos. The issues we study include feature indexing and matching,
pattern modeling, scalable detection and mining for different emerging multimedia applications.
We begin by investigating the matching algorithm of near-duplicate keyframes
(NDK) using local interest points (LIP). We propose one-to-one symmetric (OOS)
matching algorithm for fault-tolerant based NDK identification. To deal with indexing
problem, we also present a new indexing algorithm, named LIP-IS, for efficient
LIP matching. Comparative study is given to contrast LIP-IS with a more popular
indexing technique, Locality Sensitive Hashing (LSH). To model the patterns formed
by LIP matching, we propose two different techniques, supervised and unsupervised,
for efficient learning. In the supervised version, we capture LIP matching pattern as
a histogram and adopt support vector machines (SVM) to learn the pattern for NDK
classification. In the unsupervised version, we novelly propose an entropy-based evaluation
to assess the degree of near-duplicate. Two algorithms, named pattern entropy
(PE) and scale-rotation invariant PE (SR-PE), are studied accordingly for unsupervised
learning of LIP matching pattern.
In view that LIP matching is computationally expensive, especially when there are
excessive number of LIP to match between two keyframes, we further study visual keyword (VK) matching for NDK identification. VK models LIP in a keyframe as a
bag-of-words, and each keyframe is represented as a histogram of words. Thus, instead
of performing nearest neighbor based matching, direct matching of words can be efficiently
conducted. Especially with the indexing structure: inverted file index, and the
refinement technique: Hamming embedding, LIP collected from large video dataset
can be efficiently indexed, searched and matched. Nevertheless, the effectiveness and
accuracy of matching remains a problem for VK. We thus study different algorithms
and techniques to improve the matching effectiveness of LIP. These include enhanced
weak geometric consistency checking (E-WGC) algorithm for matching verification,
and reverse entropy (RE) technique for frame aggregation.
Finally, we demonstrate our works for two applications: video copy detection,
and automatic video tagging. In copy detection, we demonstrate very competitive
detection performance compared to state-of-the-art reported results on TRECVID copy
detection benchmark. In video tagging, we demonstrate two types of tagging: crosssource
tagging to automatically mine tags from one search engine to enrich tags of
videos in another search engine; and cross-time tagging to automatically tag newly
uploaded videos according to the historical video data collected over years. |
| Online Catalog Link: | http://lib.cityu.edu.hk/record=b3947825 |
| Appears in Collections: | CS - Doctor of Philosophy
|
Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.
|