City University of Hong Kong



Title: New improvements of self-organizing maps on growing structures, probabilistic formation, clustering and visualization
Other Titles: New improvements of self-organizing map on growing structures, probabilistic formation, clustering and visualization
Zi zu zhi shen jing wang luo zai wang luo jie gou zeng zhang, gai lu xing cheng ji zhi, ju lei ji ke shi hua shang de xin jin zhan
自組織神經網絡在網絡結構増長, 概率形成機制, 聚類及可視化上的新進展
Authors: Wu, Sitao (伍思濤)
Department: Dept. of Electronic Engineering
Degree: Doctor of Philosophy
Issue Date: 2004
Publisher: City University of Hong Kong
Subjects: Neural networks (Computer science)
Self-organizing systems
Notes: CityU Call Number: QA76.87.W82 2004
Includes bibliographical references (leaves 179-188)
Thesis (Ph.D.)--City University of Hong Kong, 2004
xiv, 189 leaves : ill. ; 30 cm.
Type: Thesis
Abstract: The Self-Organizing Map (SOM) is an unsupervised neural network widely used in industrial areas such as pattern recognition, biological modeling, data compression, signal processing, and data mining. It is also a computational mapping principle that generates an ordered low-dimensional map from high-dimensional input data, i.e., a nonlinear projection or dimension reduction. The traditional SOM uses a predefined, fixed network structure, so one usually has to run a number of trial tests to select an appropriate network structure and size for a given problem. This is not a flexible way of dealing with different types of data. In this thesis, two types of growing SOM are proposed, both capable of adaptively increasing their network structures. They overcome the shortcomings of the fixed structure by growing dynamically until a suitable size is reached. The SOM is, in fact, a heuristic approach and has not been derived from rigorous mathematics. In this thesis, two types of probabilistic SOM that can be derived from rigorous mathematics are proposed. They are associated with certain cost functions, and their sequential updating rules are simply the results of optimizing those cost functions. Unlike the hard assignment used in SOM, the two neural networks use soft assignment, i.e., each input datum is assigned to each neuron with some probability. SOM can also be used in clustering, but it cannot be considered directly as a clustering algorithm. When it is used in clustering, two-level methods are usually adopted, i.e., the neurons of the SOM are clustered after SOM training is complete. The problem remains that the most reasonable number of clusters has to be predefined. In this thesis, a two-level SOM-based clustering method is proposed. It can cluster data and automatically determine the most reasonable number of clusters.
Visualization can be used in data mining and can provide useful information for a better understanding of data. Through visualization one can evaluate the mined patterns and identify those that are interesting or useful. SOM can also be used for visualization because it is a nonlinear projection algorithm. In this thesis, three types of visualization methods based on SOM are proposed. They are all novel compared with the traditional SOM-based visualization methods. Based on the proposed visualization methods, interesting patterns of clusters can be found. Finally, three application examples based on the proposed improvements of SOM are included in this thesis. The first application is a neural-network-based induction machine fault detection system, in which a growing SOM algorithm is used to determine the number of hidden neurons. The second is a neural-network-based content-based image retrieval system, in which the other proposed growing SOM algorithm is used to handle large and dynamic image sets. The last application is the clustering of gene expression data, in which the proposed two-level SOM-based clustering algorithm is used to determine the optimal clusters.
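The contrast the abstract draws between hard and soft assignment can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function names, the softmax form of the soft assignment, the `temperature` parameter, and the Gaussian neighborhood are all assumptions chosen for illustration, and the update shown is only the textbook sequential SOM step.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def hard_assignment(x, weights):
    """Classical SOM: the input is assigned to exactly one neuron, the winner."""
    dists = [squared_distance(x, w) for w in weights]
    return dists.index(min(dists))

def soft_assignment(x, weights, temperature=1.0):
    """Illustrative soft assignment: one probability per neuron.

    Assumption: a softmax over negative squared distances. The thesis may
    define its probabilities differently.
    """
    dists = [squared_distance(x, w) for w in weights]
    d_min = min(dists)  # subtract the minimum for numerical stability
    scores = [math.exp(-(d - d_min) / temperature) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]

def som_step(x, weights, grid, lr=0.5, sigma=1.0):
    """One textbook sequential SOM update with a Gaussian neighborhood.

    `grid` holds the fixed map coordinates of each neuron; neurons close to
    the winner on the map are pulled toward x more strongly.
    """
    winner = hard_assignment(x, weights)
    updated = []
    for w, pos in zip(weights, grid):
        h = math.exp(-squared_distance(pos, grid[winner]) / (2.0 * sigma ** 2))
        updated.append([wi + lr * h * (xi - wi) for xi, wi in zip(x, w)])
    return updated

if __name__ == "__main__":
    weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # three neurons in 2-D
    grid = [(0,), (1,), (2,)]                       # a 1-D map
    x = [0.9, 1.1]
    print("winner:", hard_assignment(x, weights))
    print("probabilities:", soft_assignment(x, weights))
    print("updated weights:", som_step(x, weights, grid))
```

With hard assignment only the winner's index matters, whereas the soft version returns a full probability vector that a cost-function-based update could use to weight every neuron.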
Appears in Collections: EE - Doctor of Philosophy


Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.

