Author: | Ouyang, Haowen |
Title: | Multi-source domain adaptation via domain adversarial neural networks for xvector-based speaker recognition |
Advisors: | Mak, M. W. (EIE) |
Degree: | M.Sc. |
Year: | 2021 |
Subject: | Automatic speech recognition; Speech processing systems; Hong Kong Polytechnic University -- Dissertations |
Department: | Department of Electronic and Information Engineering |
Pages: | ix, 40 pages : color illustrations |
Language: | English |
Abstract: | The x-vector/PLDA framework has achieved state-of-the-art performance when the training and test data come from the same domain. However, verification error remains high when the x-vectors of the target speakers and the test speakers are extracted from different domains. This dissertation uses domain adversarial training (DAT) and domain adversarial neural networks (DANNs) to produce domain-invariant speaker embeddings from domain-mismatched x-vectors while maintaining the speaker-discriminative nature of the x-vectors. Conventional DANNs use DAT to optimize a feature extractor to produce domain-invariant features that confuse a binary domain classifier, where the latter aims to determine whether the input utterance comes from the source domain or the target domain. The proposed DANN achieves multi-source DAT by changing the domain classifier from binary to multi-class classification. Experimental results on NIST 2016 and 2018 SRE show that the proposed DANN produces speaker embeddings that achieve a lower equal error rate than the conventional x-vectors. |
Rights: | All rights reserved |
Access: | restricted access |
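The multi-source idea in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the weight `lam`, and the toy logits are hypothetical, and the abstract does not state the network architecture or loss weighting. The sketch assumes the common DANN formulation in which the domain classifier minimizes a cross-entropy over K domains, while the feature extractor minimizes the speaker loss minus a weighted domain loss (in practice this is usually implemented with a gradient reversal layer).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of a single example whose true class index is `label`."""
    return -math.log(softmax(logits)[label])


def dann_losses(speaker_logits, speaker_label, domain_logits, domain_label, lam=1.0):
    """Return (domain_classifier_loss, feature_extractor_loss) for one embedding.

    Assumption: `domain_logits` has one entry per domain (K >= 2), so the
    binary source/target classifier is the special case K = 2.
    `lam` is a hypothetical trade-off weight.
    """
    l_spk = cross_entropy(speaker_logits, speaker_label)
    l_dom = cross_entropy(domain_logits, domain_label)
    # The domain classifier tries to minimize l_dom; the feature extractor
    # is trained adversarially, so l_dom enters its objective with a minus sign.
    return l_dom, l_spk - lam * l_dom


# Toy example with K = 3 domains: uniform domain logits mean the
# classifier cannot tell the domains apart, so its loss equals log(3).
dom_loss, ext_loss = dann_losses([2.0, 0.0], 0, [0.0, 0.0, 0.0], 1, lam=0.5)
```

With uniform domain logits the domain loss reaches log K, which is the value a fully domain-invariant embedding would drive the classifier toward.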
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
5661.pdf | For All Users (off-campus access for PolyU Staff & Students only) | 2.04 MB | Adobe PDF | View/Open |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/11189