|Title: ||Kernel and spectral methods for representation and learning in image understanding|
|Other Titles: ||Tu xiang li jie zhong ji yu he fang fa he pu fang fa de biao shi he xue xi yan suan fa yan jiu|
|Authors: ||Lu, Zhiwu ( 盧志武)|
|Department: ||Department of Computer Science|
|Degree: ||Doctor of Philosophy|
|Issue Date: ||2011|
|Publisher: ||City University of Hong Kong|
|Subjects: ||Image processing -- Digital techniques.|
|Notes: ||CityU Call Number: TA1637 .L84 2011|
xvi, 163 leaves : ill. (some col.) 30 cm.
Thesis (Ph.D.)--City University of Hong Kong, 2011.
Includes bibliographical references (leaves 149-163)
|Abstract: ||This thesis investigates the challenging representation and learning problems that
occur in many image understanding tasks such as image categorization, annotation,
and retrieval. To reduce the semantic gap, which is the key open issue in image
understanding, we propose novel representation and learning approaches to image
understanding based on kernel and spectral methods. The distinct advantage of our
kernel and spectral methods is that they can be readily combined with other machine
learning techniques which are widely used in image understanding.
To capture the context within images, we propose spatial Markov kernels using
the image representation with visual keywords. Based on 2D Markov models, the spatial
dependencies between visual keywords are incorporated into two different kernels,
which differ in whether the class labels of training images are considered for kernel definition.
Our spatial Markov kernels can be applied to different image understanding
tasks such as image categorization and annotation.
We further present a novel semantics-aware image representation which is derived
from, but goes beyond, the traditional bag-of-features representation. Specifically, we learn
latent semantics automatically from a large vocabulary of visual keywords through
contextual spectral embedding by exploiting two types of context between visual
keywords for graph construction. The learnt latent semantics provide a more
succinct representation, yet a richer descriptor, than the visual keywords.
Based on our spatial Markov kernels, we propose an exhaustive and efficient constraint
propagation approach to weakly-supervised image categorization. Unlike
most previous methods, which are limited to two-class problems or use only
must-link constraints, our exhaustive and efficient constraint propagation approach
can be seen as a very general technique which is free from such limitations. More
significantly, we first explicitly show how pairwise constraints are propagated independently
and then accumulated into a conciliatory closed-form solution.
Moreover, our spatial Markov kernels can also be applied to interactive image
categorization. The context across visual keywords within an image is first captured
by our spatial Markov kernels. After graph construction with our kernels, the large amount of
unlabeled data can be exploited by graph-based semi-supervised learning through
label propagation with inter-image consistency. For interactive image categorization,
we further combine this semi-supervised learning with active learning by defining a
new diversity-based data selection criterion using spectral embedding. In this way,
we succeed in developing a novel graph-based framework which can exploit context,
consistency, and diversity cues for interactive image categorization.
Although the kernel and spectral methods for image understanding proposed above
achieve superior performance on a number of benchmark image
datasets, we also need to demonstrate their potential in other applications
such as action recognition. Following the idea of latent semantic learning for
image understanding by contextual spectral embedding, we further propose two novel
spectral methods to learn latent semantics for action recognition by parameter-free
spectral embedding with sparse representation and hypergraphs, respectively. The
superior performance of these two spectral methods on unconstrained videos verifies
that our proposed methods for image understanding can be extended to semantic
understanding of other types of multimedia content.|
|Online Catalog Link: ||http://lib.cityu.edu.hk/record=b4086654|
|Appears in Collections:||CS - Doctor of Philosophy |