City University of Hong Kong



Title: Analysis of statistical learning algorithms in data dependent function spaces
Other Titles: Shu ju xiang guan han shu kong jian zhong tong ji xue xi suan fa de fen xi (romanized Chinese title; its English rendering is the main title above)
Authors: Wang, Hongyan (王洪彥)
Department: Department of Mathematics
Degree: Doctor of Philosophy
Issue Date: 2009
Publisher: City University of Hong Kong
Subjects: Computational learning theory.
Approximation theory.
Function spaces.
Notes: CityU Call Number: Q325.7 .W36 2009
vi, 100 leaves; 30 cm.
Thesis (Ph.D.)--City University of Hong Kong, 2009.
Includes bibliographical references (leaves [87]-100)
Type: thesis
Abstract: In this thesis we study several algorithms in statistical learning theory using methods from approximation theory. First, we apply the moving least-squares method to the regression problem. The moving least-squares method is an approximation technique used for data smoothing, numerical analysis, statistics, and other purposes. It involves a weight function, such as Gaussian weights, and a finite-dimensional space of real-valued functions. In our setting the data points for the moving least-squares algorithm are drawn from a probability distribution. We conduct an error analysis for learning the regression function under mild conditions on the marginal distribution and the hypothesis space. Next, we consider a learning algorithm for regression with a data-dependent hypothesis space and an ℓ1-regularizer. The data-dependent nature of the algorithm leads to an extra error term, called the hypothesis error, which is essentially different from what arises in regularization schemes with data-independent hypothesis spaces. By bounding the regularization error, the sample error, and the hypothesis error, we estimate the total error in terms of properties of the Mercer kernel, the input space, the marginal distribution, and the regression function. For the hypothesis error in particular, we use techniques of scattered data interpolation from multivariate approximation to improve the convergence rates. Better learning rates are derived by imposing higher-order regularity on the kernel and choosing suitable values of the regularization parameter. Finally, a gradient descent algorithm for learning gradients is introduced in the framework of classification problems. Learning gradients is one approach to variable selection and feature covariation estimation when dealing with large data sets of many variables or coordinates.
In the classification setting with a convex loss function, one algorithm for gradient learning solves the convex quadratic programming problems induced by regularization schemes in reproducing kernel Hilbert spaces. The complexity of such an algorithm can be very high when the number of variables or samples is large. Our gradient descent algorithm is simple, and its convergence is studied with learning rates presented explicitly. A detailed analysis of approximation by reproducing kernel Hilbert spaces, under mild conditions on the probability measure for sampling, allows us to deal with a general class of convex loss functions.
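As a rough illustration of the first topic in the abstract, below is a minimal one-dimensional moving least-squares regression sketch. The Gaussian bandwidth `h`, the linear basis (1, x), and all function and variable names are illustrative assumptions, not taken from the thesis; the thesis treats the general setting where the data are sampled from a probability distribution.

```python
# Minimal moving least-squares (MLS) regression sketch in one dimension.
# Assumptions (not from the thesis): Gaussian weights with bandwidth h,
# and the finite-dimensional space span{1, x} of real-valued functions.
import numpy as np

def mls_predict(x_query, x_data, y_data, h=0.5):
    """Predict at x_query by a locally weighted least-squares fit."""
    # Gaussian weight of each sample relative to the query point.
    w = np.exp(-((x_data - x_query) ** 2) / (2.0 * h ** 2))
    # Design matrix for the basis p(x) = (1, x).
    P = np.column_stack([np.ones_like(x_data), x_data])
    # Weighted least squares: minimize sum_i w_i (p(x_i)^T c - y_i)^2,
    # solved here as an ordinary least-squares problem after scaling
    # both sides by sqrt(w_i).
    sw = np.sqrt(w)
    c, *_ = np.linalg.lstsq(sw[:, None] * P, sw * y_data, rcond=None)
    return c[0] + c[1] * x_query

# Noisy samples of y = sin(x); the inputs are drawn at random,
# mirroring the probabilistic sampling setting of the thesis.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, np.pi, 200)
y = np.sin(x) + 0.05 * rng.standard_normal(200)
print(mls_predict(np.pi / 2, x, y, h=0.3))  # close to sin(pi/2) = 1
```

The weight function localizes the fit: samples near the query point dominate the least-squares problem, so the global fit degenerates gracefully into a local polynomial approximation as the bandwidth shrinks.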
Appears in Collections:MA - Doctor of Philosophy

Files in This Item:

File           Size   Format
abstract.html  132 B  HTML
fulltext.html  132 B  HTML

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.
