Neural network techniques for graffiti interpretation and speech recognition

Leung, Koon-fai

Author:	Leung, Koon-fai
Title:	Neural network techniques for graffiti interpretation and speech recognition
Degree:	M.Phil.
Year:	2004
Subject:	Hong Kong Polytechnic University -- Dissertations; Neural networks (Computer science); Electronic books; Pattern perception; Speech perception
Department:	Department of Electronic and Information Engineering
Pages:	xix, 121 leaves : ill. ; 30 cm
Language:	English
Abstract:	This thesis explores neural network classification techniques for an electronic book (eBook) reading device. Two areas of application are addressed: a graffiti interpreter and a Cantonese-speech recognizer. Different structures of neural networks and hybrid neural networks incorporating fuzzy sets are used to realize the applications. An eBook reading device enhances the reading environment with interactive and multimedia features. Input to this device can be made using a stylus on a touch-screen or by voice through a microphone; in practice, the former is a pattern recognition (graffiti interpretation) problem and the latter is a speech recognition problem. With graffiti interpretation, eBook users can take full advantage of graffiti input to issue commands or enter text. The interpretation is done by the template matching technique. Two approaches are developed to realize the pattern recognition: one applies a self-structured neural network and the other a self-structured neural-fuzzy network. Improved from a 3-layer fully connected neural network/neural-fuzzy network, the self-structured network has a variable structure that adapts to the characteristics of the input patterns by incorporating link switches. By properly determining the states of the link switches through training, the dummy links can be eliminated. Simulation results show that the self-structured network performs better than a fixed-structure network in terms of network size. With a speech recognizer, eBook users can use natural speech to execute some functions of the eBook and enter characters whenever necessary. Four approaches are proposed to recognize Cantonese speech. Of them, three are feed-forward neural networks, and one is a recurrent neural network. As the first approach, the self-structured neural-fuzzy network used for graffiti interpretation is also applied to recognize Cantonese-speech commands.
Then, a neural-fuzzy network and a neural network are modified by adding associative memory to provide the network parameters. In both of these approaches, the neural-fuzzy network/neural network effectively has variable parameters that change with respect to the input patterns. Thus, the learning ability can be enhanced in the case where two feature vectors belong to the same class but are sparsely distributed. Results will be given to demonstrate the improvement in recognition accuracy, network complexity and learning rate. A discussion comparing the various approaches will also be given. By using a recurrent neural network, the sequential properties of double-syllable Cantonese digits can be modeled. The fourth approach therefore involves an associative memory for a recurrent neural network. Results will be given to demonstrate the merits of the proposed approach. A discussion comparing the static approaches and the dynamic approach will also be given. In this thesis, all neural networks are trained by an improved genetic algorithm (GA). Details of this algorithm and its performance on some benchmark test functions will be given in the Appendix.
Rights:	All rights reserved
Access:	open access

Files in This Item:

File	Description	Size	Format
b17726621.pdf	For All Users	3.73 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/3053