Full metadata record
DC Field | Value | Language
dc.contributor | Multi-disciplinary Studies | en_US
dc.creator | Ko, Tat Leung | -
dc.identifier.uri | https://theses.lib.polyu.edu.hk/handle/200/4164 | -
dc.language | English | en_US
dc.publisher | Hong Kong Polytechnic University | -
dc.rights | All rights reserved | en_US
dc.title | Investigation of spatio-temporal networks for temporal sequence recognition | en_US
dcterms.abstract | This investigation aims to verify the effectiveness of applying a spatio-temporal approach to temporal sequence recognition, in particular speech recognition. Speech recognition is fundamentally a pattern classification task: its objective is to take an input pattern, the speech waveform, and classify it as one of a set of spoken words, phrases, or sentences. A Spatio-temporal Pattern (STP) can be defined as a time-correlated sequence of spatial patterns. With a spatio-temporal network, the speech data of a single word can be divided into a series of time frames, and the data in one time frame at a time is sent to the network for processing. This reduces the complexity of the network and the processing time. One advantage of the spatio-temporal network is that it can be constructed dynamically, thus simulating the effect of training. One disadvantage is that different words with the same ending may produce ambiguous results. In addition, Recurrent Neural Networks (RNNs) were applied to recognize the speech data using the Real Time Recurrent Learning (RTRL) algorithm, a gradient-following learning algorithm for completely recurrent networks running in continually sampled time. The merit of the RTRL algorithm is its ability to process input data continuously without requiring a fixed, or even bounded, epoch length. Its drawbacks are that it requires a great deal of computation on each update cycle and that it is non-local. For both algorithms, the recognition accuracy obtained is about 75%. We found that spatio-temporal networks are more suitable for recognizing speech from a single speaker, whilst the RTRL algorithm is more appropriate for speech from multiple speakers. To increase accuracy with spatio-temporal networks, the number of time frames for each word should be more or less the same. For the RTRL algorithm, it is better to use the minimum squared error rather than slope differences to determine how close the network output curve is to the desired output curve. | en_US
dcterms.extent | vi, 82 leaves : ill. ; 30 cm | en_US
dcterms.isPartOf | PolyU Electronic Theses | en_US
dcterms.issued | 1996 | en_US
dcterms.educationalLevel | All Master | en_US
dcterms.educationalLevel | M.Sc. | en_US
dcterms.LCSH | Automatic speech recognition | en_US
dcterms.LCSH | Neural networks (Computer science) | en_US
dcterms.LCSH | Hong Kong Polytechnic University -- Dissertations | en_US
dcterms.accessRights | restricted access | en_US
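
The abstract in the record above mentions two ideas that a short example may make more concrete: dividing a word's speech data into time frames, and judging how close a network output curve is to a desired curve by squared error rather than by slope differences. The Python sketch below is only an illustration under stated assumptions. The thesis's actual framing scheme and its exact definition of "slope differences" are not given in this record, so the function names, the non-overlapping frames, and the use of first differences as "slopes" are all hypothetical choices made here.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[float], frame_len: int) -> List[List[float]]:
    """Cut a 1-D signal into consecutive, non-overlapping frames.

    Assumption: frames do not overlap, and a trailing partial frame is
    dropped so that every frame has the same length.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    n_frames = len(samples) // frame_len
    return [list(samples[i * frame_len:(i + 1) * frame_len]) for i in range(n_frames)]


def mean_squared_error(output: Sequence[float], target: Sequence[float]) -> float:
    """Average of the squared point-wise differences between two curves."""
    if len(output) != len(target) or len(output) == 0:
        raise ValueError("curves must be non-empty and of equal length")
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)


def mean_slope_difference(output: Sequence[float], target: Sequence[float]) -> float:
    """Average absolute difference between the curves' first differences.

    Assumption: the "slope" at step i is approximated by x[i + 1] - x[i].
    """
    if len(output) != len(target) or len(output) < 2:
        raise ValueError("curves must have equal length of at least 2")
    diffs = [
        abs((output[i + 1] - output[i]) - (target[i + 1] - target[i]))
        for i in range(len(output) - 1)
    ]
    return sum(diffs) / len(diffs)


# A desired curve and an output curve that has the same shape but is
# shifted upward by a constant offset.
target = [0.0, 1.0, 0.0, 1.0]
shifted = [t + 0.5 for t in target]

frames = split_into_frames([1, 2, 3, 4, 5, 6, 7], frame_len=3)
slope_score = mean_slope_difference(shifted, target)
mse_score = mean_squared_error(shifted, target)
```

Under these assumptions, a constant offset between the two curves leaves the slope-based measure at zero while the squared error is non-zero. This is one possible reason a squared-error criterion could separate curves that a slope-based one treats as identical, though the record does not say whether this is the effect the author observed.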

Files in This Item:
File | Description | Size | Format
b12197993.pdf | For All Users (off-campus access for PolyU Staff & Students only) | 2.61 MB | Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/4164