An empirical study of classifier fusion schemes for handwritten character recognition

Chan, Wha-san

Full metadata record

DC Field	Value	Language
dc.contributor	Multi-disciplinary Studies	en_US
dc.contributor	Department of Computing	en_US
dc.creator	Chan, Wha-san	-
dc.identifier.uri	https://theses.lib.polyu.edu.hk/handle/200/4724	-
dc.language	English	en_US
dc.publisher	Hong Kong Polytechnic University	-
dc.rights	All rights reserved	en_US
dc.title	An empirical study of classifier fusion schemes for handwritten character recognition	en_US
dcterms.abstract	Traditional Chinese Character Recognition (CCR) systems usually use a single classifier operating on a single set of features. Such a classifier must cope with a large class set and noisy inputs, so a close-to-perfect single classifier is very difficult to obtain. A large feature set also increases the complexity of the classifier. Classification performance can be improved by combining the decisions of different classifiers, or of classifiers operating on different sets of feature vectors. Ideally, the combination algorithm should take advantage of each individual decision. This work studies different classifier fusion schemes in the context of CCR. The data comprise 186 Chinese character categories with ten samples each; seven samples are used for training and the remaining three for testing. Each character image is preprocessed before being fed into the classifier. Preprocessing includes smoothing, normalization and thinning, and the processed images are used at two resolutions (32x32 and 64x64). The preprocessed images then undergo feature extraction. Four feature vectors are extracted: the directional feature [13], the simple feature point [9,14,18], the peripheral shape feature [15,16] and the cross count [8,10,17]. They can be combined to form a full feature vector of 140 dimensions. Alternatively, they can be randomly grouped into three feature vectors. Three popular classifiers are built. The first is a Modified Learning Vector Quantization (MLVQ) [19,20] neural network, whose maximum recognition rate is 69.7% for the top candidate. The second is a multi-layer perceptron with the standard back-propagation learning algorithm [19]. To make the network converge on the training data, several techniques are employed: scaling of the input vector, shuffling of the input data, scaling of the initial weights [25] using the Nguyen-Widrow method, and batch updates [19]. Its maximum recognition rate is 65.5%. The third is a minimum distance classifier based on Bayesian decision [12], with a maximum recognition rate of 65.8%. Different combinations of results from each classifier (based on different feature vectors) are passed to different fusion schemes for post-classification. Four fusion schemes are tested in this dissertation. The first is majority voting. The second is ranking, in which the top ten ranks from each classifier are used; the results show that the top three candidates from each classifier are critical to fusion performance. The third is the Bayesian formalism and the fourth is confidence level aggregation. Of the four, the ranking method and linear confidence aggregation outperform the other two. When classifiers of the same type are fused, majority voting reaches a maximum recognition rate of 74.3% and the Bayesian formalism 74.7%. The ranking method and linear confidence aggregation give similar results, with highest recognition rates of 77.6% and 78.0% respectively. The dissertation also modifies the schemes to combine ranking with the Bayesian formalism, which raises the maximum recognition rate to 81.4%. Finally, an experiment that combines results from different types of classifiers gives the highest recognition rate obtained, 84.9%.	en_US
dcterms.extent	v, 71, [73] p. : ill. ; 30 cm	en_US
dcterms.isPartOf	PolyU Electronic Theses	en_US
dcterms.issued	2000	en_US
dcterms.educationalLevel	All Master	en_US
dcterms.educationalLevel	M.Sc.	en_US
dcterms.LCSH	Chinese character sets (Data processing)	en_US
dcterms.LCSH	Optical character recognition devices	en_US
dcterms.LCSH	Hong Kong Polytechnic University -- Dissertations	en_US
dcterms.accessRights	restricted access	en_US

Files in This Item:

File	Description	Size	Format
b15321393.pdf	For All Users (off-campus access for PolyU Staff & Students only)	3.94 MB	Adobe PDF



Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/4724