Author: Ouyang, Haowen
Title: Multi-source domain adaptation via domain adversarial neural networks for xvector-based speaker recognition
Advisors: Mak, M. W. (EIE)
Degree: M.Sc.
Year: 2021
Subject: Automatic speech recognition; Speech processing systems; Hong Kong Polytechnic University -- Dissertations
Department: Department of Electronic and Information Engineering
Pages: ix, 40 pages : color illustrations
Language: English
Abstract: The x-vector/PLDA framework has achieved state-of-the-art performance when the training and test data come from the same domain. However, verification error remains high when the x-vectors of the target speakers and the test speakers are extracted from different domains. This dissertation uses domain adversarial training (DAT) and domain adversarial neural networks (DANNs) to produce domain-invariant speaker embeddings from domain-mismatched x-vectors while maintaining the speaker-discriminative nature of the x-vectors. Conventional DANNs use DAT to optimize a feature extractor to produce domain-invariant features that confuse a binary domain classifier, which aims to determine whether the input utterance comes from the source domain or the target domain. The proposed DANN achieves multi-source DAT by changing the domain classifier from binary to multi-class classification. Experimental results on NIST 2016 and 2018 SRE show that the proposed DANN can produce speaker embeddings that achieve a lower equal error rate than the conventional x-vectors.
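The adversarial objective described in the abstract can be illustrated with a minimal sketch. It assumes the standard DANN formulation, in which the feature extractor minimizes the speaker loss minus a weighted domain loss while the domain classifier minimizes the domain loss; the dissertation's exact objective, network sizes, and weighting are not given in this record. The function names and the weight `lam` are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Multi-class cross-entropy for one example with an integer label."""
    return -math.log(softmax(logits)[label])


def dann_losses(speaker_logits, speaker_label, domain_logits, domain_label, lam=1.0):
    """Return (extractor_loss, domain_classifier_loss) for one embedding.

    Assumption: standard DANN objective, not confirmed as the author's.
    domain_logits has one entry per domain, so K > 2 domains gives the
    multi-class (multi-source) case instead of the binary one.
    """
    l_spk = cross_entropy(speaker_logits, speaker_label)
    l_dom = cross_entropy(domain_logits, domain_label)
    # The domain classifier minimizes l_dom; the extractor is rewarded
    # (lower loss) when the domain classifier is confused (high l_dom).
    return l_spk - lam * l_dom, l_dom


# Three domains (hypothetical): a fully confused domain classifier
# outputs uniform logits, giving a domain loss of log(3).
ext_confused, dom_confused = dann_losses([2.0, 0.0], 0, [0.0, 0.0, 0.0], 1)
ext_sure, dom_sure = dann_losses([2.0, 0.0], 0, [0.0, 5.0, 0.0], 1)
```

In this sketch the extractor loss is lower when the domain classifier cannot tell the domains apart (`ext_confused < ext_sure`), which is the pressure that pushes the embeddings toward domain invariance while the speaker term keeps them speaker-discriminative.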
Rights: All rights reserved
Access: restricted access

Files in This Item:
File: 5661.pdf
Description: For All Users (off-campus access for PolyU Staff & Students only)
Size: 2.04 MB
Format: Adobe PDF

