Author: Yiu, Kwok-kwong Michael
Title: Speaker verification based on probabilistic neural networks with a priori decision thresholds
Degree: M.Phil.
Year: 2000
Subject: Speech processing systems
Automatic speech recognition
Neural networks (Computer science)
Hong Kong Polytechnic University -- Dissertations
Department: Department of Electronic and Information Engineering
Pages: ix, 115 leaves : ill. ; 30 cm
Language: English
Abstract: Speaker verification is the task of verifying a speaker's identity from his or her voice. Typically, a speaker verification system requires one or more decision thresholds for making verification decisions: accepting genuine users and rejecting impostors. For the purpose of comparing the performance of different systems, researchers usually adjust the thresholds during verification in order to equalise the false acceptance rate and the false rejection rate. However, in a real-world environment, the thresholds should be determined prior to verification. In conventional approaches to speaker verification, a speaker model is constructed for each user, followed by a threshold determination procedure. While this two-step approach has been successful in many situations, it does not account for the interaction between the speaker models and the decision thresholds. In this dissertation, we integrate the speaker model construction and threshold determination procedures in a single framework by using probabilistic decision-based neural networks (PDBNNs). A PDBNN can be considered as a Gaussian mixture model (GMM) with trainable decision thresholds. GMMs have been widely used as speaker models because of their capability to model arbitrary density functions. However, GMMs are limited in that they do not provide a proper mechanism for setting decision thresholds. By using the thresholding mechanism of PDBNNs, this dissertation aims to improve the robustness of speaker verification systems against intruder attacks. This dissertation begins with detailed illustrations that compare the decision boundaries of PDBNNs with those of GMMs. The comparison is based on two pattern recognition tasks, namely the noisy XOR problem and the classification of two-dimensional vowel data. Experimental results show that the thresholding mechanism of PDBNNs is very effective in detecting data not belonging to any known classes. 
Based on this finding, the dissertation explains how the networks can be extended to speaker verification. Experimental evaluations based on 138 speakers of the YOHO corpus have been conducted. It is found that the error rate obtained by the PDBNNs is about half of that of Higgins et al. (a benchmark error rate for the YOHO corpus), suggesting that the discriminative training procedure of PDBNNs is able to improve the robustness of the speaker models. It is also found that the discriminative training procedure of PDBNNs is able to embed the background speakers' characteristics in the speaker models, resulting in a substantial saving in computational resources during verification. This work has also explored various channel compensation techniques for speaker verification over the public telephone network. A new channel compensation approach, which is based on the measurement of telephone handsets' frequency responses, is proposed. The capability of various channel compensation methods, such as cepstral mean subtraction and signal bias removal, in reducing channel distortion is compared with that of the proposed approach. Results show that the proposed approach outperforms the conventional cepstral mean subtraction but is slightly inferior to signal bias removal.
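As background for the channel compensation comparison mentioned in the abstract, the following is a minimal sketch of conventional cepstral mean subtraction. It is not the handset-based approach proposed in the thesis, nor the thesis's implementation. It relies on the standard observation that a fixed linear channel adds an approximately constant offset to every cepstral vector of an utterance, so subtracting the per-utterance mean removes that offset. The function name, the toy frames, and the `channel_bias` values are hypothetical and chosen only for illustration.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-utterance mean from each cepstral coefficient.

    frames: list of equal-length lists, one cepstral vector per frame.
    Returns a new list of mean-normalised cepstral vectors.
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    # Mean of each coefficient over all frames of the utterance.
    mean = [sum(f[k] for f in frames) / n_frames for k in range(dim)]
    return [[f[k] - mean[k] for k in range(dim)] for f in frames]


# Hypothetical utterance: 3 frames, 2 cepstral coefficients each.
clean = [[1.0, -2.0], [3.0, 0.0], [2.0, 2.0]]

# Assumed channel effect: a constant additive offset in the cepstral domain.
channel_bias = [0.5, -1.5]
distorted = [[c + b for c, b in zip(f, channel_bias)] for f in clean]

cms_clean = cepstral_mean_subtraction(clean)
cms_distorted = cepstral_mean_subtraction(distorted)

# After normalisation the constant channel offset has been removed,
# so both versions of the utterance yield the same features.
max_diff = max(
    abs(a - b)
    for fa, fb in zip(cms_clean, cms_distorted)
    for a, b in zip(fa, fb)
)
print(max_diff < 1e-9)
```

In this toy case the clean and channel-distorted utterances map to the same normalised features, which is the property the method exploits. Real telephone channels are only approximately constant over an utterance, so the cancellation is not exact in practice.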
Rights: All rights reserved
Access: open access

Files in This Item:
File: b15353898.pdf
Description: For All Users
Size: 3.58 MB
Format: Adobe PDF



Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/5328