City University of Hong Kong



Title: Learning algorithms producing sparse approximations
Other Titles: Chan sheng xi shu bi jin de xue xi suan fa
Authors: Guo, Xin (郭昕)
Department: Department of Mathematics
Degree: Doctor of Philosophy
Issue Date: 2011
Publisher: City University of Hong Kong
Subjects: Machine learning.
Notes: CityU Call Number: Q325.5 .G86 2011
v, 90 leaves 30 cm.
Thesis (Ph.D.)--City University of Hong Kong, 2011.
Includes bibliographical references (leaves 81-89)
Type: thesis
Abstract: A class of learning algorithms for regression is studied. They are modified kernel projection machines in a least squares ℓq-regularization scheme with 0 < q ≤ 1, on a data-dependent hypothesis space spanned by empirical features (constructed from a reproducing kernel and the learning data). The algorithms have three advantages. First, they do not involve any high-dimensional optimization process, so the computational complexity is reduced, which also makes it easy to adjust the regularization parameter by, e.g., cross-validation. Second, they produce sparse representations with respect to empirical features under a mild condition, without assuming sparsity of the regression function in terms of any basis or system. Third, the output function converges to the regression function in the reproducing kernel Hilbert space at a satisfactory rate, which is given explicitly. We analyze the algorithm with the ℓ1 penalty first. Our analysis shows that, while being sparse, the output function converges to the underlying regression function. The convergence rate O(m^{ε-1/2}) is obtained in two different cases, in which the eigenvalues of the integral operator generated by the Mercer kernel are assumed to decay polynomially and exponentially, respectively. We then study the algorithm with the general ℓq penalty, where 0 < q ≤ 1. Our goal is to understand the influence of the regularizing parameter on both the learning rate and the sparsity. Our analysis suggests that as q decreases to zero, the sparsity increases while the approximating ability is weakened. We also study the algorithm in a special probability model where the noise is assumed to be independent of the sampling place. The goal is to obtain better learning rates as well as sparsity. Our analysis shows that in this model the sparsity is independent of the index q, while the exponent of the learning rate takes the form 1/2 + O(1/r) when r is large, and the index q appears only in the term O(1/r).
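The following is a minimal sketch, not the thesis's own algorithm, of why an ℓ1-penalized least squares fit over empirical features can avoid high-dimensional optimization. It assumes a Gaussian kernel, takes the empirical features to be the eigenvectors of the normalized Gram matrix K/m (scaled to be orthonormal in the empirical inner product), and uses the penalty form (1/m)‖y − Φc‖² + γ‖c‖₁; under those assumptions the problem decouples and each coefficient is obtained by soft thresholding. All function names, the kernel choice, the normalization, and the threshold γ/2 are illustrative assumptions that do not come from the abstract.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    """Gaussian (RBF) Gram matrix between rows of X and rows of Z (assumed kernel)."""
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Z**2, axis=1)[None, :]
          - 2.0 * X @ Z.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def soft_threshold(s, t):
    """Coordinate-wise soft thresholding: shrink each entry of s toward 0 by t."""
    return np.sign(s) * np.maximum(np.abs(s) - t, 0.0)

def fit_sparse_empirical_features(X, y, gamma, sigma=1.0, tol=1e-8):
    """Hypothetical helper: l1-penalized least squares over empirical features.

    Minimizes (1/m)||y - Phi c||^2 + gamma * ||c||_1, where Phi^T Phi / m = I,
    so the minimizer is c_j = soft_threshold(<Phi_j, y>/m, gamma/2).
    """
    m = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    evals, evecs = np.linalg.eigh(K / m)          # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]               # reorder to descending
    evals, evecs = evals[order], evecs[:, order]
    keep = evals > tol                            # drop numerically null directions
    evals, evecs = evals[keep], evecs[:, keep]
    Phi = np.sqrt(m) * evecs                      # feature values at the sample points
    s = Phi.T @ y / m                             # empirical inner products with y
    c = soft_threshold(s, gamma / 2.0)            # closed-form sparse coefficients
    return c, evals, evecs

def predict(Xnew, X, c, evals, evecs, sigma=1.0):
    """Evaluate the fitted function at new points via the Nystrom extension."""
    m = X.shape[0]
    Kx = gaussian_kernel(Xnew, X, sigma)
    Phi_new = Kx @ evecs / (np.sqrt(m) * evals)   # phi_j(x) = sum_i k(x, x_i) u_j[i] / (sqrt(m) * lambda_j)
    return Phi_new @ c
```

Because the only work beyond one eigendecomposition is a thresholding step, refitting for a different γ (for instance inside a cross-validation loop) costs one pass over the coefficient vector in this sketch, and a larger γ can only shrink the set of nonzero coefficients.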
Appears in Collections: MA - Doctor of Philosophy

Files in This Item:

File            Size    Format
abstract.html   132 B   HTML
fulltext.html   132 B   HTML

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.

