Full metadata record
DC Field: Value [Language]
dc.contributor: Department of Electronic and Information Engineering [en_US]
dc.contributor.advisor: Lun, P. K. Daniel (EIE) [en_US]
dc.contributor.advisor: Lam, K. M. Kenneth (EIE) [en_US]
dc.creator: Zhao, Rui
dc.identifier.uri: https://theses.lib.polyu.edu.hk/handle/200/11712
dc.publisher: Hong Kong Polytechnic University [en_US]
dc.rights: All rights reserved [en_US]
dc.title: Machine learning for facial-expression recognition [en_US]
dcterms.abstract: Facial expression recognition (FER) aims to identify the emotional state of humans from facial observations, and has valuable applications in mental-health monitoring, lie and pain detection, psychological diagnosis, and related areas. Owing to the great progress made in deep learning techniques for computer vision in the past few years, facial expression analysis based on deep neural networks has significantly boosted FER performance in practical applications. In this thesis, we conduct an in-depth study of deep-learning-based FER and propose effective learning frameworks that facilitate FER reasoning with expression generative priors and geometric knowledge. [en_US]
dcterms.abstract: Firstly, we study learning regularization in current deep FER frameworks. We show that extending locality-preserving algorithms can help deep networks learn a meaningful feature manifold for more effective and generalizable FER. We therefore reformulate the objective function of soft locality-preserving mapping for subspace formation and employ it as an additional penalty to supervise representation learning with deep neural networks. This regularization considers both the inter- and intra-class variations in a local region of the manifold, which improves the discriminability of the learned features and consequently benefits FER performance. [en_US]
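The record gives no implementation details, but a locality-preserving penalty of this general kind can be written in a few lines of PyTorch. The sketch below is a hypothetical illustration, not the soft locality-preserving mapping objective used in the thesis; the function name, the neighbourhood size k and the margin are assumptions.

import torch
import torch.nn.functional as F

def locality_preserving_penalty(features, labels, k=3, margin=1.0):
    # Toy regularizer: pull each sample towards its k nearest same-class
    # neighbours in feature space and push its k nearest different-class
    # neighbours beyond a margin. Illustrative only; the thesis's actual
    # soft locality-preserving mapping objective may be formulated differently.
    dist = torch.cdist(features, features)                 # pairwise Euclidean distances (N x N)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)      # same-class mask (N x N)
    eye = torch.eye(len(labels), dtype=torch.bool, device=features.device)
    big = dist.detach().max().item() + 1.0                 # constant used to mask out invalid pairs

    # distances to the k nearest same-class neighbours (excluding the sample itself)
    intra = dist.masked_fill(~same | eye, big).topk(k, largest=False).values
    # distances to the k nearest different-class neighbours
    inter = dist.masked_fill(same, big).topk(k, largest=False).values

    # minimise intra-class distances, keep inter-class distances above the margin
    return intra.mean() + F.relu(margin - inter).mean()

# Typical use as an additional penalty on top of the classification loss:
# loss = F.cross_entropy(logits, labels) + lam * locality_preserving_penalty(feats, labels)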
dcterms.abstract: Secondly, we analyze conditional facial expression generators, which inherently abstract the semantic knowledge of different facial expressions. To introduce this generative knowledge into FER networks, we establish a multi-task learning framework, based on generative adversarial networks, that bridges the facial expression recognition and synthesis tasks in a selective information-sharing fashion. In this way, we eliminate redundant and harmful features in the task interaction and improve FER accuracy with the beneficial semantic features embedded in the expression generator. [en_US]
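As a rough illustration of what selective information sharing between a recognition branch and a generator branch could look like, the PyTorch-style module below gates generator features with learned per-channel weights. The module name and structure are assumptions for illustration only and do not reproduce the GAN-based multi-task design described in the thesis.

import torch
import torch.nn as nn

class SelectiveSharingGate(nn.Module):
    # Hypothetical gating module: the recognition branch learns per-channel
    # weights that decide how much of each generator feature to absorb.
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, rec_feat, gen_feat):
        # rec_feat, gen_feat: (batch, dim) features from the recognition and
        # generation branches of a shared backbone
        g = self.gate(torch.cat([rec_feat, gen_feat], dim=-1))   # selection weights in (0, 1)
        return rec_feat + g * gen_feat                           # absorb only the useful generator features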
dcterms.abstract: Thirdly, we show that the geometric knowledge behind facial observations is also highly related to facial expressions and can be employed in FER systems to enhance performance. We therefore present a dual-stream framework, based on graph convolutional networks and convolutional neural networks, to extract more discriminative emotion representations from both the facial-appearance and facial-geometry modalities. With the geometric knowledge, the framework shows better generalization and robustness. [en_US]
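To make the geometric stream concrete, the following is a minimal PyTorch sketch of a graph-convolution stream over facial landmarks, assuming a fixed landmark adjacency matrix; it is illustrative only and does not reproduce the graph definition, layer design, or fusion scheme used in the thesis.

import torch
import torch.nn as nn

class LandmarkGCNStream(nn.Module):
    # Minimal two-layer graph convolution over facial landmarks; `adj` is a
    # (V x V) float adjacency matrix connecting related landmarks.
    def __init__(self, adj, in_dim=2, hid_dim=64, out_dim=128):
        super().__init__()
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        self.register_buffer("A", adj / deg)        # row-normalised adjacency
        self.fc1 = nn.Linear(in_dim, hid_dim)
        self.fc2 = nn.Linear(hid_dim, out_dim)

    def forward(self, x):                           # x: (batch, V, 2) landmark coordinates
        x = torch.relu(self.A @ self.fc1(x))        # aggregate features from neighbouring landmarks
        x = torch.relu(self.A @ self.fc2(x))
        return x.mean(dim=1)                        # graph-level geometric embedding

# An appearance embedding from a CNN stream could then be fused with this geometric
# embedding, e.g. torch.cat([cnn_feat, gcn_feat], dim=-1), before the classifier.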
dcterms.abstract: Finally, we extend the geometry-enhanced FER framework to the spatio-temporal domain for video-based expression recognition. To better capture the temporal correlation between the vertices in a facial graph, we further introduce transformers into the framework to learn long-range dependencies between vertices in a facial sequence. With the self-attention mappings in the transformer, we build non-local feature interactions that describe the facial sequence more comprehensively. In the geometry-related frameworks, we also investigate attention mechanisms that guide the networks to concentrate on the most informative facial regions (components) and frames in the spatial and temporal domains. [en_US]
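A schematic PyTorch sketch of such temporal modelling is given below: per-frame facial-graph embeddings are passed through a standard transformer encoder so that self-attention relates all frames in the sequence. The class name, feature dimensions and the seven-class output are assumptions, not the architecture proposed in the thesis.

import torch
import torch.nn as nn

class TemporalGraphTransformer(nn.Module):
    # Hypothetical temporal model: per-frame graph embeddings are fed to a
    # transformer encoder so that self-attention captures long-range
    # dependencies across the whole facial sequence.
    def __init__(self, dim=128, heads=4, layers=2, num_classes=7):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)
        self.classifier = nn.Linear(dim, num_classes)   # e.g. seven basic expressions

    def forward(self, frame_feats):                     # (batch, frames, dim) per-frame graph embeddings
        ctx = self.encoder(frame_feats)                 # non-local interactions between all frames
        return self.classifier(ctx.mean(dim=1))         # temporal average pooling, then classification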
dcterms.abstract: The frameworks proposed in this thesis are evaluated through comparisons with other state-of-the-art methods on commonly used FER benchmarks, including both posed and in-the-wild databases. We also perform cross-dataset evaluations and robustness tests. Experimental results and ablation studies demonstrate that the proposed frameworks achieve promising performance and that the novel designs in the frameworks are beneficial to FER learning. [en_US]
dcterms.extent: 159 pages : color illustrations [en_US]
dcterms.isPartOf: PolyU Electronic Theses [en_US]
dcterms.issued: 2022 [en_US]
dcterms.educationalLevel: Ph.D. [en_US]
dcterms.educationalLevel: All Doctorate [en_US]
dcterms.LCSH: Facial expression -- Data processing [en_US]
dcterms.LCSH: Machine learning [en_US]
dcterms.LCSH: Hong Kong Polytechnic University -- Dissertations [en_US]
dcterms.accessRights: open access [en_US]

Files in This Item:
File       Description     Size       Format
6241.pdf   For All Users   23.58 MB   Adobe PDF


Copyright Undertaking

As a bona fide Library user, I declare that:

  1. I will abide by the rules and legal ordinances governing copyright regarding the use of the Database.
  2. I will use the Database for the purpose of my research or private study only and not for circulation or further reproduction or any other purpose.
  3. I agree to indemnify and hold the University harmless from and against any loss, damage, cost, liability or expenses arising from copyright infringement or unauthorized usage.

By downloading any item(s) listed above, you acknowledge that you have read and understood the copyright undertaking as stated above, and agree to be bound by all of its terms.


Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/11712