Please use this identifier to cite or link to this item:
http://hdl.handle.net/2031/5448
|
| Title: | Approaches to symbol recognition and spotting |
| Other Titles: | Fu hao shi bie he ding wei suan fa 符號識別和定位算法 |
| Authors: | Zhang, Wan (張琬) |
| Department: | Department of Computer Science |
| Degree: | Doctor of Philosophy |
| Issue Date: | 2008 |
| Publisher: | City University of Hong Kong |
| Subjects: | Image processing -- Digital techniques. Computer graphics. Pattern recognition systems. Image analysis. |
| Notes: | CityU Call Number: TA1637 .Z4295 2008. xiii, 116 leaves : ill. 30 cm. Thesis (Ph.D.)--City University of Hong Kong, 2008. Includes bibliographical references (leaves 103-114). |
| Type: | thesis |
| Abstract: | This thesis addresses the problems of symbol recognition and symbol
spotting. We first review the literature on the
symbol recognition problem from three aspects: 1) how to represent
a symbol, 2) how to match symbols based on their representations,
and 3) how to solve the symbol alignment problem during symbol
matching. As the main contributions, we then propose two different
approaches for symbol recognition and symbol spotting.
The first contribution is a statistical approach to symbol recognition.
In this approach, a symbol is measured and represented by
2-dimensional kernel densities around the sampling points on its skeleton.
The Kullback-Leibler (KL) divergence is then used to measure
the similarity between the densities of two symbols. Compared
with the results on the same test sets reported in (Yang, 2005), which were the best
overall results in the 2003 GREC Contest, our method achieves
better overall performance except on two subsets that combine
deformation and degradation. On the hand-drawn drawings generated
with the 50 models, also from the 2003 GREC Contest, our recognition
rates are 96% and 94% for the two generated sets, respectively. These are much better than those of Su’s method, which are 72% and 74%, respectively.
We propose two methods to eliminate the effect of rotation.
One is to adjust the rotation angle by minimizing the KL divergence
between the test symbol and the model symbol with a gradient-based
method. The other is to exploit the independent component analysis
(ICA) technique, which considers the higher-order statistical information
of the data and whose outputs are generally invariant to any invertible
linear transformation. The first method gives quite accurate
results, as shown in the experiment section, and the second method
greatly reduces the computation, with performance degraded by only
about 5%. With ICA, it takes only about 80 ms to match a
pair of symbols, which is only 1/30 of the time required by the gradient-based algorithm.
The second approach is a hybrid approach to the recognition
or spotting of symbols in vectorial form. We calculate all primitive-pair
relationships in a symbol and generate the signature of the symbol
from those relationships. The symbol is thus represented as a feature
set containing all the primitive-pair relationships. Matching between
two symbols or drawings is then reduced to the problem of matching two
feature sets. We apply the approach to the GREC2003 test sets with
symbols in vectorial form. Since the best matching models for all
test symbols are correctly retrieved, the recognition accuracies are all
100%. Note that the most similar models retrieved for a test symbol
may not be unique. If the recognition result is required to be
unique, the accuracy of the approach drops. However,
all of the recognition accuracies remain above 92%. For this database,
at most two best matching models are retrieved for each test symbol.
In such situations, the approach can be applied to filter out symbols that cannot be similar, and more accurate recognition can then be
performed on the remaining candidates. The approach preserves high recognition
accuracy while speeding up the matching procedure. Since all
the models can be processed in advance, the important factors that determine
the efficiency of the approach in the symbol recognition task are
the average time to create a signature for a test symbol and the time
for matching a pair of symbols. On average, a signature of a symbol
in the proposed approach can be created in 0.025 seconds. Moreover,
matching a pair of symbols costs only 0.01 seconds. Compared with
the proposed statistical method, which takes about
2.4 seconds to match a pair of symbols with rotation estimation,
the proposed approach improves the matching speed by about two hundred
times.
Moreover, we develop a general framework for the performance
evaluation of recognition approaches. All preprocessing
steps are the same for the approaches under evaluation. Both symbol
databases and shape databases are used for evaluation. In symbol
databases, each symbol is unique and the goal is to find the exact
matching symbol for the test symbol. Shape databases, however,
usually include shapes in various categories or classes, and the aim
is to find the characteristics of a class and retrieve the most similar
shapes for the test symbol. Several traditional measures, e.g.
recognition accuracy and top matching rates, are applied to evaluate
the performance of the methods. Generally, the symbol recognition approaches
perform well on the symbol databases and are strongly competitive
in retrieving the most similar shapes. Furthermore,
new measures, namely homogeneity and separability, are proposed to
explore more characteristics of the methods so that they can be better understood. High homogeneity means that a descriptor represents
the symbols in the same class in a highly similar way. High separability
means that a descriptor distinguishes symbols in different classes
well. These two measures are intended to evaluate how well the symbols
are distributed in the representation space provided by the symbol
descriptor. Experimental results show that the proposed statistical
method in Chapter 3 yields a relatively sparse intra-class structure:
if we regard the representations of the shapes in one class as points
in the representation space, their distribution
is relatively sparse. This also helps explain why the
proposed statistical method is robust to noisy and distorted symbols. |
| Online Catalog Link: | http://lib.cityu.edu.hk/record=b2340589 |
| Appears in Collections: | CS - Doctor of Philosophy
|
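The statistical approach summarized in the abstract compares symbols through the KL divergence between 2-dimensional kernel densities and estimates rotation by minimizing that divergence. The following is a minimal sketch of that idea, not the thesis's implementation. Several parts are assumptions made for illustration: the isotropic Gaussian kernel, the fixed bandwidth, the sample-based KL estimate, the synthetic L-shaped point set standing in for skeleton sampling points, and all function names. A coarse grid search over angles is used here in place of the gradient-based method the abstract mentions.

```python
import numpy as np

def rotate(points, degrees):
    """Rotate 2-D points about the origin by the given angle in degrees."""
    t = np.deg2rad(degrees)
    r = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    return points @ r.T

def kde2d(samples, queries, bandwidth):
    """Isotropic Gaussian kernel density built on `samples`, evaluated at `queries`."""
    diff = queries[:, None, :] - samples[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2)) / (2.0 * np.pi * bandwidth ** 2)
    return kernel.mean(axis=1)

def kl_divergence(p_points, q_points, bandwidth=0.1, eps=1e-12):
    """Sample-based estimate of KL(p || q): mean of log(p/q) over p's own points."""
    p = kde2d(p_points, p_points, bandwidth)
    q = kde2d(q_points, p_points, bandwidth)
    # eps guards against log(0) where q has essentially no mass (an assumption).
    return float(np.mean(np.log((p + eps) / (q + eps))))

def estimate_rotation(test_points, model_points, step=5):
    """Grid search (not gradient-based) for the model rotation minimizing the KL estimate."""
    angles = np.arange(0, 360, step)
    scores = [kl_divergence(test_points, rotate(model_points, a)) for a in angles]
    return int(angles[int(np.argmin(scores))])

# Hypothetical data: an asymmetric L-shaped point set stands in for skeleton samples.
arm_x = np.column_stack([np.linspace(0.0, 1.0, 30), np.zeros(30)])
arm_y = np.column_stack([np.zeros(15), np.linspace(0.0, 0.5, 15)])
model = np.vstack([arm_x, arm_y])
test = rotate(model, 40)

print(kl_divergence(test, model))
print(estimate_rotation(test, model))
```

With this synthetic input, the search should select the grid angle closest to the true rotation. A finer step improves angular resolution at proportionally higher cost, which is the trade-off the abstract's timing comparison between the gradient-based and ICA-based variants reflects.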
Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.
|