Please use this identifier to cite or link to this item:
http://hdl.handle.net/2031/6634

Title:  Learning algorithms producing sparse approximations 
Other Titles:  Chan sheng xi shu bi jin de xue xi suan fa 產生稀疏逼近的學習算法 
Authors:  Guo, Xin ( 郭昕) 
Department:  Department of Mathematics 
Degree:  Doctor of Philosophy 
Issue Date:  2011 
Publisher:  City University of Hong Kong 
Subjects:  Machine learning. 
Notes:  CityU Call Number: Q325.5 .G86 2011. v, 90 leaves ; 30 cm. Thesis (Ph.D.)--City University of Hong Kong, 2011. Includes bibliographical references (leaves 81-89). 
Type:  thesis 
Abstract:  A class of learning algorithms for regression is studied. They are modified kernel
projection machines in a least squares ℓq-regularization scheme with 0 < q ≤ 1,
on a data-dependent hypothesis space spanned by empirical features (constructed
by a reproducing kernel and the learning data). The algorithms have three advantages.
First, they do not involve any high-dimensional optimization process,
so the computational complexity is reduced, which also makes it easy to adjust
the regularization parameter by, e.g., cross-validation. Second, they
produce sparse representations with respect to empirical features under a mild
condition, without assuming sparsity of the regression function in terms of any
basis or system. Third, the output function converges to the regression function
in the reproducing kernel Hilbert space at a satisfactory rate which is explicitly
given.
We analyze the algorithm with the ℓ1 penalty first. Our analysis shows that,
while enjoying sparsity, the output function converges to the underlying regression
function. The convergence rate O(m^(ε−1/2)) is obtained in two different cases,
where we assume the eigenvalues of the integral operator generated by the Mercer
kernel decay polynomially and exponentially, respectively.
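Because the empirical features form an orthonormal system, the ℓ1 scheme reduces to soft-thresholding the projections of the outputs, with no high-dimensional optimization. A minimal sketch under that assumption (function names and kernel choice are hypothetical illustrations, not the thesis's exact algorithm):

```python
import numpy as np

def soft_threshold(v, lam):
    # Coordinate-wise soft-thresholding: the closed-form minimizer of
    # 0.5*(c - v)^2 + lam*|c| in each coordinate.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def l1_kernel_projection(K, y, lam):
    """Sketch of an l1-regularized kernel projection machine.

    K   : (m, m) kernel (Gram) matrix on the sample
    y   : (m,) observed outputs
    lam : regularization parameter

    Projects y onto the empirical features (eigenvectors of K) and
    soft-thresholds each coefficient; the l1 problem decouples
    coordinate-wise in this orthonormal eigenbasis.
    """
    eigvals, eigvecs = np.linalg.eigh(K)        # empirical features
    coeffs = eigvecs.T @ y                      # projections of y
    sparse_coeffs = soft_threshold(coeffs, lam) # sparsify
    # Values of the output function at the sample points:
    return eigvecs @ sparse_coeffs

# Toy usage: Gaussian kernel on 1-D data (values illustrative only)
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 50)
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal(50)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)
f_hat = l1_kernel_projection(K, y, lam=0.5)
```

Larger lam zeroes out more projection coefficients, giving a sparser representation in the empirical features at the cost of approximation accuracy.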
We then study the algorithm with the general ℓq penalty where 0 < q ≤ 1.
Our goal is to understand the influence of the regularization parameter on both
the learning rate and sparsity. Our analysis suggests that as q decreases to zero,
the sparsity increases while the approximating ability is weakened.
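This trade-off is already visible in the one-dimensional subproblem each coefficient solves, min over c of 0.5(c − v)² + λ|c|^q: as q decreases toward zero, the threshold below which a coefficient is set to zero grows. A numerical illustration by grid search (the solver and the specific numbers are assumptions for illustration only):

```python
import numpy as np

def lq_prox(v, lam, q):
    # Numerically minimize 0.5*(c - v)^2 + lam*|c|^q over a fine
    # symmetric grid that contains c = 0 (always a candidate minimizer
    # for 0 < q <= 1).
    grid = np.linspace(-abs(v), abs(v), 20001)
    obj = 0.5 * (grid - v) ** 2 + lam * np.abs(grid) ** q
    return grid[np.argmin(obj)]

# Same coefficient v and same lambda; only the exponent q changes.
c_l1 = lq_prox(1.2, 1.0, q=1.0)   # l1 penalty: soft-thresholding keeps ~0.2
c_lq = lq_prox(1.2, 1.0, q=0.1)   # near-l0 penalty: coefficient set to 0
```

With the smaller exponent the penalty at any nonzero c is close to a constant λ, so zero beats every small nonzero value: more coefficients are killed (more sparsity), but the surviving approximation is coarser.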
We also study the algorithm in a special probability model where the noise
is assumed to be independent of the sampling position. The goal is to obtain better learning rates as well as sparsity. Our analysis shows that in this model, the
sparsity is independent of the index q, while the power exponent of the learning rate
takes the form 1/2 + O(1/r) when r is large, and the index q appears only in the O(1/r) term. 
Online Catalog Link:  http://lib.cityu.edu.hk/record=b4086783 
Appears in Collections:  MA  Doctor of Philosophy

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.
