Author: | Xia, Junwei |
Title: | Self-supervised features for speech emotion recognition |
Degree: | M.Sc. |
Year: | 2024 |
Department: | Department of Electrical and Electronic Engineering |
Pages: | ii, 21 pages : color illustrations |
Language: | English |
Abstract: | Speech Emotion Recognition (SER) is an essential aspect of human-computer interaction, yet it poses significant challenges because of the complexity and subtlety of emotional expression in speech. Traditional SER approaches that rely on hand-crafted features often fail to capture this complexity. The advent of self-supervised learning models such as Wav2Vec 2.0 offers an opportunity to leverage rich speech representations for SER. Nonetheless, fully fine-tuning a large pre-trained model incurs substantial computational cost and risks overfitting, particularly when labelled data is scarce. This study investigates parameter-efficient fine-tuning techniques that incorporate prompt embeddings and adapter modules into a frozen pre-trained Wav2Vec 2.0 model. We conduct extensive experiments on the IEMOCAP dataset, comparing conventional fine-tuning methods with our proposed approach. Our results show that the proposed model achieves an unweighted accuracy of 69.67%, comparable to full fine-tuning, while reducing trainable parameters by approximately 87%. The combined use of prompt tuning and adapters allows the proposed model to adapt effectively to the SER task at lower computational cost, offering a practical solution for real-world applications where resources are limited. |
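The abstract describes attaching lightweight trainable components (prompt embeddings and per-layer bottleneck adapters) to a frozen Wav2Vec 2.0 backbone. A rough sketch of why this is parameter-efficient is to count what would actually be trained. The hidden size (768) and layer count (12) below follow the public Wav2Vec 2.0 base configuration; the bottleneck width (64) and prompt length (20) are illustrative assumptions, not the thesis's settings:

```python
# Hypothetical parameter-count sketch; dimensions are illustrative
# assumptions, not the thesis's actual configuration.
HIDDEN, LAYERS = 768, 12         # Wav2Vec 2.0 base: hidden size, transformer layers
BOTTLENECK, PROMPT_LEN = 64, 20  # assumed adapter bottleneck and prompt length

def adapter_params(hidden: int, bottleneck: int) -> int:
    """Weights + biases of one bottleneck adapter (down- and up-projection)."""
    down = hidden * bottleneck + bottleneck  # hidden -> bottleneck
    up = bottleneck * hidden + hidden        # bottleneck -> hidden
    return down + up

# One adapter per transformer layer, plus learnable prompt embeddings
# prepended to the input sequence; the backbone itself stays frozen.
adapters = LAYERS * adapter_params(HIDDEN, BOTTLENECK)
prompt = PROMPT_LEN * HIDDEN
trainable = adapters + prompt

print(f"adapter parameters: {adapters:,}")   # 1,189,632
print(f"prompt parameters:  {prompt:,}")     # 15,360
print(f"total trainable:    {trainable:,}")  # ~1.2M vs ~95M in the frozen backbone
```

Under these assumed dimensions, only about a million parameters are updated; the 87% reduction reported in the thesis depends on its own adapter sizes, prompt length, and which components (e.g. the classification head) are counted as trainable.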
Rights: | All rights reserved |
Access: | restricted access |
Files in This Item:
File | Description | Size | Format
---|---|---|---
8298.pdf | For All Users (off-campus access for PolyU Staff & Students only) | 305.57 kB | Adobe PDF
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/13891