A fast scoring method for PLDA with uncertainty propagation

Lin, Wei-wei

Full metadata record

DC Field	Value	Language
dc.contributor	Department of Electronic and Information Engineering	en_US
dc.contributor.advisor	Mak, M. W. (EIE)	-
dc.creator	Lin, Wei-wei	-
dc.identifier.uri	https://theses.lib.polyu.edu.hk/handle/200/9445	-
dc.language	English	en_US
dc.publisher	Hong Kong Polytechnic University	-
dc.rights	All rights reserved	en_US
dc.title	A fast scoring method for PLDA with uncertainty propagation	en_US
dcterms.abstract	Speaker verification refers to the task of determining whether or not a claimant is the person he/she claims to be. In text-independent speaker verification, using i-vectors as a low-dimensional feature representation and probabilistic linear discriminant analysis (PLDA) for session compensation and classification has achieved state-of-the-art performance in many scenarios. However, the good performance of the standard i-vector/PLDA framework relies on the condition that both the enrolment utterances and the test utterances are sufficiently long for reliable estimation of i-vectors. In real applications, both enrolment and test utterances could be very short, resulting in erroneous i-vector estimation. Recently, an innovative approach to addressing the short-utterance problem in the i-vector/PLDA framework has been proposed. By propagating the covariance of i-vectors into the PLDA model, this approach explicitly expresses the uncertainty of i-vector extraction in the verification stage. The method is called Uncertainty Propagation (UP). It has shown superior performance over the standard i-vector/PLDA framework in short-utterance scenarios. However, the method leads to session-dependent loading matrices in the PLDA model, which makes the verification process computationally expensive. Besides, the method requires a large amount of memory for storing the covariance matrices of the target speakers' i-vectors. A method to alleviate the computational burden and memory requirement of Uncertainty Propagation is imperative. This thesis proposes a method to speed up the verification process and to relax the memory requirement of UP by building a repository that stores the length-dependent matrices. During verification, the proper length-dependent matrices are selected for scoring.
Experiments on the NIST 2012 Speaker Recognition Evaluation show that the proposed method performs as well as standard UP while using only 3.7% of the scoring time and 37% of the memory that standard UP requires. Besides, with a minor compromise in performance (an increase of 0.35% in equal error rate, EER), the method can further reduce memory consumption to only 15% of that of standard UP.	en_US
dcterms.extent	vii, 57 pages : color illustrations	en_US
dcterms.isPartOf	PolyU Electronic Theses	en_US
dcterms.issued	2016	en_US
dcterms.educationalLevel	M.Sc.	en_US
dcterms.educationalLevel	All Master	en_US
dcterms.LCSH	Hong Kong Polytechnic University -- Dissertations	en_US
dcterms.LCSH	Automatic speech recognition	en_US
dcterms.accessRights	restricted access	en_US

Files in This Item:

File	Description	Size	Format
991022131147203411.pdf	For All Users (off-campus access for PolyU Staff & Students only)	786.87 kB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/9445