Full metadata record
DC Field | Value | Language
dc.contributor | Multi-disciplinary Studies | en_US
dc.contributor | Department of Electronic Engineering | en_US
dc.creator | Tung, Ching-hon | -
dc.identifier.uri | https://theses.lib.polyu.edu.hk/handle/200/2062 | -
dc.language | English | en_US
dc.publisher | Hong Kong Polytechnic University | -
dc.rights | All rights reserved | en_US
dc.title | A study of phoneme synthesis with neural networks | en_US
dcterms.abstract | This report is divided into two parts. The first part is a preliminary study of the properties of Linear Prediction Coding (LPC) [1] and Artificial Neural Networks (ANN). Some phoneme units are digitised and recorded in Sound Blaster (SB) voice files. They are converted to LPC and then back to SB files, and the difference and distortion between the original and synthetic sound are investigated. Moreover, a multi-layer back-propagation network (MLP) [2] is constructed to map the encoded phonemes to LPC of order 12. The function mapping and memory properties of the MLP are well demonstrated. The second part of the report addresses the main objective: a study of phoneme synthesis with neural networks. A real-time recurrent learning network (RTRL) [3][4] is constructed and its capability to achieve the task is investigated. First, the teacher-forced learning method described in [4] is used to train the network, and a stable oscillating output is achieved after training. The result shows that this learning method may not be able to achieve the task, because the stable oscillation cannot follow the variation of the speech waveform. Phonemes can be simply divided into vowels and consonants. Vowels are voiced and periodic, while consonants are voiceless and non-periodic. Since the network described in [3] can be applied to model a second-order IIR lowpass filter and can be used for function prediction, it suggests that the network may be trained to behave like a filter that generates waveforms from information about their frequency components. The network is therefore modified to behave like a recursive filter that generates the waveforms from output feedback and some frequency components of the waveforms. Some simple vowels are used to train the network, and the results are promising. The network seems able to learn to generate periodic waveforms such as vowels, but the synthesis of non-periodic consonants remains unsolved. | en_US
dcterms.extent | ii, 53 leaves : ill. (some col.) ; 30 cm | en_US
dcterms.isPartOf | PolyU Electronic Theses | en_US
dcterms.issued | 1995 | en_US
dcterms.educationalLevel | All Master | en_US
dcterms.educationalLevel | M.Sc. | en_US
dcterms.LCSH | Neural networks (Computer science) | en_US
dcterms.LCSH | Speech processing systems | en_US
dcterms.LCSH | Speech synthesis | en_US
dcterms.LCSH | Hong Kong Polytechnic University -- Dissertations | en_US
dcterms.accessRights | restricted access | en_US
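
The abstract describes converting recorded phonemes to order-12 LPC and back, then examining the distortion. The sketch below illustrates that round trip in general terms only. It is not the thesis's implementation: the autocorrelation method with the Levinson-Durbin recursion is an assumed choice, all function names are hypothetical, Sound Blaster file handling is omitted, and a synthetic vowel-like frame (three sinusoids plus a little noise) stands in for a recorded phoneme.

```python
import math
import random

ORDER = 12  # LPC order quoted in the abstract


def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a finite frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (a[1..order], error energy)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient from the previous-order predictor.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:], err


def lpc_residual(x, coeffs):
    """Analysis filter: e[n] = x[n] - sum_k a_k * x[n-k]."""
    out = []
    for n in range(len(x)):
        pred = sum(a * x[n - k] for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(x[n] - pred)
    return out


def lpc_synthesize(excitation, coeffs):
    """Synthesis filter: y[n] = e[n] + sum_k a_k * y[n-k]."""
    y = []
    for n in range(len(excitation)):
        pred = sum(a * y[n - k] for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        y.append(excitation[n] + pred)
    return y


# Synthetic stand-in for a voiced frame (assumption, not thesis data).
rng = random.Random(0)
signal = [
    math.sin(2 * math.pi * 0.02 * n)
    + 0.5 * math.sin(2 * math.pi * 0.06 * n)
    + 0.25 * math.sin(2 * math.pi * 0.10 * n)
    + 0.01 * rng.uniform(-1.0, 1.0)
    for n in range(400)
]

coeffs, _ = levinson_durbin(autocorrelation(signal, ORDER), ORDER)
residual = lpc_residual(signal, coeffs)
rebuilt = lpc_synthesize(residual, coeffs)

signal_energy = sum(s * s for s in signal)
residual_energy = sum(e * e for e in residual)
print("prediction gain (dB):", 10 * math.log10(signal_energy / residual_energy))
print("max round-trip error:", max(abs(s - y) for s, y in zip(signal, rebuilt)))
```

Driving the synthesis filter with the exact residual reproduces the frame up to rounding error. Audible distortion of the kind the abstract mentions would arise only once the residual is replaced by a simplified excitation or the coefficients are quantised, which this sketch does not attempt.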

Files in This Item:
File | Description | Size | Format
b15554168.pdf | For All Users (off-campus access for PolyU Staff & Students only) | 3.94 MB | Adobe PDF


Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/2062