An empirical study of classifier fusion schemes for handwritten character recognition

Pao Yue-kong Library Electronic Theses Database


Author: Chan, Wha-san
Title: An empirical study of classifier fusion schemes for handwritten character recognition
Degree: M.Sc.
Year: 2000
Subject: Chinese character sets (Data processing); Optical character recognition devices; Hong Kong Polytechnic University -- Dissertations
Department: Multi-disciplinary Studies; Dept. of Computing
Pages: v, 71, [73] p. : ill. ; 30 cm
Language: English
InnoPac Record: http://library.polyu.edu.hk/record=b1532139
URI: http://theses.lib.polyu.edu.hk/handle/200/4724
Abstract: Traditional Chinese Character Recognition (CCR) systems usually use a single classifier operating on a single set of features. Such a classifier must handle a large class set and noisy inputs, so a close-to-perfect single classifier is very difficult to obtain. A large feature set also increases the complexity of the classifier. Classification performance can be improved by combining the decisions of different classifiers, or of classifiers operating on different sets of feature vectors. Ideally, the combination algorithm should take advantage of the strengths of each individual decision. This work studies different classifier fusion schemes in the context of CCR. The data consist of 186 Chinese character categories with ten samples each; seven samples are used for training and the remaining three for testing. Each character image is preprocessed before being fed into the classifier. Preprocessing includes smoothing, normalization and thinning, and processed images at two resolutions (32x32 and 64x64) are used. The preprocessed images then undergo feature extraction. Four feature vectors are extracted: the directional feature [13], the simple feature point [9,14,18], the peripheral shape feature [15,16] and the cross count [8,10,17]. They can be combined to form a full feature vector of 140 dimensions. Alternatively, they can be randomly grouped into three feature vectors. Three popular classifiers are built. The first is a Modified Learning Vector Quantization (MLVQ) [19,20] neural network, whose maximum recognition rate is 69.7% for the top candidate. The second is a multi-layer perceptron with the standard back-propagation learning algorithm [19]. Several techniques are employed to make the network converge on the training data: scaling of the input vector, shuffling of the input data, scaling of the initial weights [25] using the Nguyen-Widrow method, and batch updating [19]. Its maximum recognition rate is 65.5%. The third is a minimum distance classifier based on Bayesian decision [12], with a maximum recognition rate of 65.8%. Different combinations of results from each classifier (based on different feature vectors) are passed to different fusion schemes for post-classification. Four fusion schemes are tested in this dissertation. The first is majority voting. The second is ranking, in which the top ten ranks from each classifier are used; the results show that the top three candidates from each classifier are critical to fusion performance. The third is Bayesian formalism and the fourth is confidence level aggregation. Of the four algorithms, the ranking method and linear confidence aggregation outperform the other two. When classifiers of the same type are fused, majority voting reaches a maximum recognition rate of 74.3% and Bayesian formalism reaches 74.7%. The ranking method and linear confidence aggregation give similar results, with highest recognition rates of 77.6% and 78.0% respectively. The dissertation also introduces a modification that combines ranking and Bayesian formalism, which raises the maximum recognition rate to 81.4%. Finally, an experiment that combines results from different types of classifiers achieves the highest recognition rate, 84.9%.

Files in this item

Files Size Format
b15321393.pdf 4.037Mb PDF

     
