Full metadata record
DC Field | Value | Language
dc.contributor | Department of Electronic and Information Engineering | en_US
dc.contributor.advisor | Mak, M. W. (EIE) | -
dc.creator | Lin, Wei-wei | -
dc.identifier.uri | https://theses.lib.polyu.edu.hk/handle/200/9445 | -
dc.language | English | en_US
dc.publisher | Hong Kong Polytechnic University | -
dc.rights | All rights reserved | en_US
dc.title | A fast scoring method for PLDA with uncertainty propagation | en_US
dcterms.abstract | Speaker verification refers to the task of determining whether or not a claimant is the person he/she claims to be. In text-independent speaker verification, using i-vectors as a low-dimensional feature representation and probabilistic linear discriminant analysis (PLDA) for session compensation and classification has achieved state-of-the-art performance in many scenarios. However, the good performance of the standard i-vector/PLDA framework relies on the condition that both the enrolment utterances and the test utterances are sufficiently long for reliable estimation of i-vectors. In real applications, both enrolment and test utterances could be very short, resulting in erroneous i-vector estimation. Recently, an innovative approach to addressing the short-utterance problem in the i-vector/PLDA framework has been proposed. By propagating the covariance of i-vectors into the PLDA model, this approach explicitly expresses the uncertainty of i-vector extraction in the verification stage. The method is called Uncertainty Propagation (UP). It has shown superior performance over the standard i-vector/PLDA framework in short-utterance scenarios. However, the method leads to session-dependent loading matrices in the PLDA model, which makes the verification process computationally expensive. Besides, the method also requires a large amount of memory for storing the covariance matrices of the target speakers' i-vectors. A method to alleviate the computational burden and memory requirement of Uncertainty Propagation is imperative. This thesis proposes a method to speed up the verification process and to reduce the memory requirement of UP by building up a repository to store the length-dependent matrices. During verification, the proper length-dependent matrices are selected for scoring. Experiments on the NIST 2012 Speaker Recognition Evaluation show that the proposed method performs as well as the standard UP with only 3.7% of the scoring time and 37% of the memory consumption that standard UP would take. Besides, with a minor compromise in performance (an increase of 0.35% in EER), the method can further reduce memory consumption to only 15% of that of standard UP. | en_US
dcterms.extent | vii, 57 pages : color illustrations | en_US
dcterms.isPartOf | PolyU Electronic Theses | en_US
dcterms.issued | 2016 | en_US
dcterms.educationalLevel | M.Sc. | en_US
dcterms.educationalLevel | All Master | en_US
dcterms.LCSH | Hong Kong Polytechnic University -- Dissertations | en_US
dcterms.LCSH | Automatic speech recognition | en_US
dcterms.accessRights | restricted access | en_US

Files in This Item:
File | Description | Size | Format
991022131147203411.pdf | For All Users (off-campus access for PolyU Staff & Students only) | 786.87 kB | Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/9445