City University of Hong Kong

CityU Institutional Repository >
3_CityU Electronic Theses and Dissertations >
ETD - Dept. of Computer Science  >
CS - Doctor of Philosophy  >

Please use this identifier to cite or link to this item:

Title: Algorithms and techniques for large-scale near-duplicate video mining
Other Titles: Da gui mo jin si shi pin jian suo de suan fa ji ji shu yan jiu
Authors: Zhao, Wanlei (趙萬磊)
Department: Department of Computer Science
Degree: Doctor of Philosophy
Issue Date: 2010
Publisher: City University of Hong Kong
Subjects: Data mining.
Multimedia systems.
Notes: CityU Call Number: QA76.9.D343 Z43 2010
xi, 155 leaves : ill. (some col.) 30 cm.
Thesis (Ph.D.)--City University of Hong Kong, 2010.
Includes bibliographical references (leaves 143-152)
Type: thesis
Abstract: As bandwidth accessible to average users is increasing, video has become one of the fastest growing data type in Internet. Especially with the popularity of social media, there has been exponential growth in videos available on the Web. Among these huge volumes of videos, there exist large numbers of near-duplicates and copies. Near-duplicates (NDs) carry both blessing and redundant signals. For example, ND provides rich visual clue for indexing and summarizing broadcast videos from different channels. On the other hand, the excessive amount of ND makes browsing web videos streamed over Internet an extremely time-consuming task. In this thesis, we conduct studies to research the algorithms and techniques for large-scale mining of near-duplicate videos. The issues we study include feature indexing and matching, pattern modeling, scalable detection and mining for different emerging multimedia applications. We begin by investigating the matching algorithm of near-duplicate keyframes (NDK) using local interest points (LIP). We propose one-to-one symmetric (OOS) matching algorithm for fault-tolerant based NDK identification. To deal with indexing problem, we also present a new indexing algorithm, named LIP-IS, for efficient LIP matching. Comparative study is given to contrast LIP-IS with a more popular indexing technique, Locality Sensitive Hashing (LSH). To model the patterns formed by LIP matching, we propose two different techniques, supervised and unsupervised, for efficient learning. In the supervised version, we capture LIP matching pattern as a histogram and adopt support vector machines (SVM) to learn the pattern for NDK classification. In the unsupervised version, we novelly propose an entropy-based evaluation to assess the degree of near-duplicate. Two algorithms, named pattern entropy (PE) and scale-rotation invariant PE (SR-PE), are studied accordingly for unsupervised learning of LIP matching pattern. In view that LIP matching is computationally expensive, especially when there are excessive number of LIP to match between two keyframes, we further study visual keyword (VK) matching for NDK identification. VK models LIP in a keyframe as a bag-of-words, and each keyframe is represented as a histogram of words. Thus, instead of performing nearest neighbor based matching, direct matching of words can be efficiently conducted. Especially with the indexing structure: inverted file index, and the refinement technique: Hamming embedding, LIP collected from large video dataset can be efficiently indexed, searched and matched. Nevertheless, the effectiveness and accuracy of matching remains a problem for VK. We thus study different algorithms and techniques to improve the matching effectiveness of LIP. These include enhanced weak geometric consistency checking (E-WGC) algorithm for matching verification, and reverse entropy (RE) technique for frame aggregation. Finally, we demonstrate our works for two applications: video copy detection, and automatic video tagging. In copy detection, we demonstrate very competitive detection performance compared to state-of-the-art reported results on TRECVID copy detection benchmark. In video tagging, we demonstrate two types of tagging: crosssource tagging to automatically mine tags from one search engine to enrich tags of videos in another search engine; and cross-time tagging to automatically tag newly uploaded videos according to the historical video data collected over years.
Online Catalog Link:
Appears in Collections:CS - Doctor of Philosophy

Files in This Item:

File Description SizeFormat
abstract.html132 BHTMLView/Open
fulltext.html132 BHTMLView/Open

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.


Valid XHTML 1.0!
DSpace Software © 2013 CityU Library - Send feedback to Library Systems
Privacy Policy · Copyright · Disclaimer