|Title: ||Sparsity and statistical analysis of some learning algorithms|
|Other Titles: ||Xue xi suan fa de xi shu xing yu tong ji fen xi|
|Authors: ||Guo, Zhengchu (郭正初)|
|Department: ||Department of Mathematics|
|Degree: ||Doctor of Philosophy|
|Issue Date: ||2011|
|Publisher: ||City University of Hong Kong|
|Subjects: ||Machine learning.|
|Notes: ||CityU Call Number: Q325.5 .G87 2011|
vi, 107 leaves 30 cm.
Thesis (Ph.D.)--City University of Hong Kong, 2011.
Includes bibliographical references (leaves -107)
|Abstract: ||Learning algorithms aim at learning functions or function features from samples. In
this thesis, we investigate some learning algorithms for regression and classification, as well as
some spectral algorithms in learning theory. Error analysis is conducted from the viewpoint of
approximation theory, and the sparsity of spectral algorithms is studied.
We first consider least squares regularization schemes in reproducing kernel
Hilbert spaces for regression problems with unbounded sampling. In the literature, it is
often assumed that the output of the learning and sampling process is uniformly bounded.
However, this assumption is rather strong: it is not satisfied by the Gaussian distribution
commonly used in statistics. In this thesis, under some moment incremental
conditions, we analyze the error with unbounded sampling via concentration estimates
based on ℓ2-empirical covering numbers. The learning rate obtained is the best in the literature,
and in some situations it is even better than those for the bounded case. A kernel-based
online learning algorithm for regression with unbounded sampling processes is also
studied. Under the moment incremental condition on the sampling output, we provide
a satisfactory confidence-based bound for the error in the corresponding reproducing
kernel Hilbert space.
The second part of the thesis considers binary classification generated by Tikhonov
regularization schemes in reproducing kernel Hilbert spaces associated with general
convex loss functions in a non-i.i.d. setting. We drop the assumptions that the samples
are independent and that they are identically distributed. We derive capacity-dependent
learning rates for the excess misclassification error via the ℓ2-empirical covering number.
Our learning rate is consistent with that of the i.i.d. setting.
Finally, we consider some spectral algorithms for regression in reproducing kernel
Hilbert spaces. When the filter function vanishes near the origin and the output function
is represented in terms of empirical features, the corresponding spectral algorithm
has sparsity. Sparsity and learning rates are obtained under a regularity assumption on
the regression function.|
|Online Catalog Link: ||http://lib.cityu.edu.hk/record=b4086784|
|Appears in Collections:||MA - Doctor of Philosophy |