Title: Analysis of statistical learning algorithms in data dependent function spaces
Other Titles: Shu ju xiang guan han shu kong jian zhong tong ji xue xi suan fa de fen xi
Authors: Wang, Hongyan (王洪彥)
Department: Department of Mathematics
Degree: Doctor of Philosophy
Issue Date: 2009
Publisher: City University of Hong Kong
Subjects: Computational learning theory.
Notes: CityU Call Number: Q325.7 .W36 2009
vi, 100 leaves ; 30 cm.
Thesis (Ph.D.)--City University of Hong Kong, 2009.
Includes bibliographical references (leaves -100)
Abstract: In this thesis we study some algorithms in statistical learning theory by methods from approximation theory.
First we apply the moving least-squares method to the regression problem. The moving least-squares method is an approximation technique used for data smoothing, numerical analysis, statistics and other purposes. It involves a weight function, such as Gaussian weights, and a finite dimensional space of real valued functions. In our setting the data points for the moving least-squares algorithm are drawn from a probability distribution. We conduct an error analysis for learning the regression function by imposing mild conditions on the marginal distribution and the hypothesis space.
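The moving least-squares idea above can be sketched as follows: at each query point, fit a low-degree polynomial by weighted least squares with Gaussian weights centred at the query, with samples drawn from a distribution. The bandwidth, the linear basis, and all names are illustrative choices of this sketch, not prescribed by the thesis.

```python
import numpy as np

def mls_estimate(x_query, X, y, sigma=0.05):
    """Moving least-squares estimate of the regression function at x_query.

    Fits a local linear polynomial by Gaussian-weighted least squares
    centred at x_query (sigma and the basis are illustrative choices).
    """
    # Gaussian weight function: samples near x_query dominate the fit
    w = np.exp(-(X - x_query) ** 2 / (2 * sigma ** 2))
    # Finite dimensional space of real valued functions: span{1, x}
    B = np.column_stack([np.ones_like(X), X])
    # Weighted least squares, solved via sqrt-weight rescaling
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * B, sw * y, rcond=None)
    # Evaluate the fitted local polynomial at the query point
    return coef[0] + coef[1] * x_query

# Data points drawn from a probability distribution, as in the thesis setting
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 200)
y = np.sin(2 * np.pi * X) + 0.1 * rng.normal(size=200)

estimate = mls_estimate(0.25, X, y)  # true regression value is sin(pi/2) = 1
```

Each query point requires its own weighted fit, which is what makes the method "moving": the hypothesis is local and data dependent rather than a single global regressor.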
Then we consider a learning algorithm for regression with a data dependent hypothesis space and an ℓ1-regularizer. The data dependent nature of the algorithm leads to an extra error term, called the hypothesis error, which does not arise in regularization schemes with data independent hypothesis spaces. By dealing with the regularization error, the sample error and the hypothesis error, we estimate the total error in terms of properties of the Mercer kernel, the input space, the marginal distribution and the regression function of the regression problem. In particular, for the hypothesis error we use techniques of scattered data interpolation from multivariate approximation to improve the convergence rates. Better learning rates are derived by imposing higher order regularity on the kernel and choosing suitable values of the regularization parameter.
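As a concrete illustration of such a scheme, the hypothesis space spanned by kernel functions at the sample points is data dependent, and the ℓ1 penalty acts on the expansion coefficients. The sketch below minimizes (1/m)·||Kc − y||² + λ·||c||₁ by iterative soft-thresholding; the solver and all constants are our choices for illustration, since the thesis analyses the scheme, not a particular optimizer.

```python
import numpy as np

def ista_l1_kernel(K, y, lam, n_iter=2000):
    """Minimize (1/m)*||K c - y||^2 + lam*||c||_1 over the coefficients c
    of the data dependent expansion f = sum_j c_j K(., x_j).

    Solved by iterative soft-thresholding (proximal gradient descent);
    an illustrative solver choice, not the thesis's method of analysis.
    """
    m = len(y)
    # Step size 1/L, with L the Lipschitz constant of the smooth part
    step = m / (2.0 * np.linalg.norm(K, 2) ** 2)
    c = np.zeros(m)
    for _ in range(n_iter):
        grad = (2.0 / m) * K.T @ (K @ c - y)
        z = c - step * grad
        # Soft-thresholding: the proximal map of the l1 penalty
        c = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return c

# Samples and a Mercer (Gaussian) kernel matrix on the sample points
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 0.3 ** 2))

c = ista_l1_kernel(K, y, lam=1e-3)
mse = np.mean((K @ c - y) ** 2)
```

The ℓ1 penalty drives many coefficients exactly to zero, so the learned expansion uses only a subset of the sample-centred kernel functions; a sufficiently large λ zeroes out every coefficient.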
Finally, a gradient descent algorithm for learning gradients is introduced in the framework of classification problems. Learning gradients is an approach to variable selection and feature covariation estimation when dealing with large data sets of many variables or coordinates. In the classification setting with a convex loss function, one possible algorithm for gradient learning solves the convex quadratic programming problems induced by regularization schemes in reproducing kernel Hilbert spaces; the complexity of such an algorithm can be very high when the number of variables or samples is huge. Our gradient descent algorithm is simple, and its convergence is studied with learning rates explicitly presented. A deep analysis of approximation by reproducing kernel Hilbert spaces under mild conditions on the sampling probability measure allows us to deal with a general class of convex loss functions.
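A minimal sketch of the gradient-descent flavour of gradient learning, under assumptions of ours: a logistic-type convex loss on first-order Taylor margins, Gaussian locality weights, and an RKHS expansion for both the scalar part g and the vector field f estimating the gradient. The parametrization, constants and names are illustrative, not the thesis's notation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy classification data in R^2: the label depends only on the first
# coordinate, so the true classification gradient points along e1.
m, d = 60, 2
X = rng.uniform(-1.0, 1.0, (m, d))
Y = np.sign(X[:, 0])

sigma_w, sigma_K, lam, step, n_iter = 0.5, 0.5, 1e-3, 0.03, 400

# Gaussian locality weights w_ij and Mercer kernel matrix K
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-D2 / (2 * sigma_w ** 2))
K = np.exp(-D2 / (2 * sigma_K ** 2))
Delta = X[None, :, :] - X[:, None, :]   # Delta[i, j] = x_j - x_i

# RKHS parametrization: g(x) = sum_k a_k K(x, x_k),
# f(x) = sum_k C[k] K(x, x_k) with vector coefficients C[k] in R^d
a = np.zeros(m)
C = np.zeros((m, d))

def objective(a, C):
    """Weighted empirical risk with logistic loss on Taylor margins
    y_j * (g(x_i) + f(x_i) . (x_j - x_i)), plus RKHS-norm penalties."""
    G, F = K @ a, K @ C
    T = Y[None, :] * (G[:, None] + np.einsum('id,ijd->ij', F, Delta))
    data = np.mean(W * np.logaddexp(0.0, -T))
    reg = lam * (a @ K @ a + np.sum(C * (K @ C)))
    return data + reg

E0 = objective(a, C)
for _ in range(n_iter):
    G, F = K @ a, K @ C
    T = Y[None, :] * (G[:, None] + np.einsum('id,ijd->ij', F, Delta))
    S = -1.0 / (1.0 + np.exp(T))          # derivative of the logistic loss
    P = (W * S * Y[None, :]) / m ** 2     # dE/dT weighted by w_ij
    # Plain gradient descent on the coefficients (no QP solver needed)
    a -= step * (K @ P.sum(axis=1) + 2 * lam * (K @ a))
    C -= step * (K @ np.einsum('ij,ijd->id', P, Delta) + 2 * lam * (K @ C))
E1 = objective(a, C)

F = K @ C   # estimated gradient at the sample points
```

For variable selection, one compares the sizes of the coordinate components of the learned vector field: here the first coordinate carries all the label information, so the first column of F should dominate the second. The per-iteration cost is matrix arithmetic only, avoiding the quadratic programming step whose complexity can be prohibitive for many variables or samples.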
Online Catalog Link: http://lib.cityu.edu.hk/record=b2375053
Appears in Collections: MA - Doctor of Philosophy
Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.