A framework for personal emotion categorization and visual attention estimation

Pao Yue-kong Library Electronic Theses Database

Author: Huang, Xuelin
Title: A framework for personal emotion categorization and visual attention estimation
Degree: Ph.D.
Year: 2016
Subject: Human face recognition (Computer science)
Eye tracking.
Human-computer interaction.
Hong Kong Polytechnic University -- Dissertations
Department: Dept. of Computing
Pages: 1 online resource (xxvi, 153 pages) : color illustrations
Language: English
InnoPac Record: http://library.polyu.edu.hk/record=b2925604
URI: http://theses.lib.polyu.edu.hk/handle/200/8756
Abstract: Visual signals, such as those obtained by observing facial expressions and eye movements, are key to understanding how humans think and feel. There has therefore been much previous work in facial expression analysis and eye gaze analysis, but that work is hampered by two main challenges: human behavior varies widely between people, which makes it hard to generalize across individuals; and data annotation is expensive, which makes it difficult to collect large amounts of data from which to generalize. In my thesis work, I address these challenges through a framework for personal emotion categorization and visual attention estimation. I establish several approaches for constructing accurate user-dependent models, which are designed to address the challenge of personal differences in facial affect and visual attention estimation problems. My work focuses on a non-intrusive approach that would be suitable for in-situ contexts, without the need for specialized hardware.
For facial affect recognition, to make it feasible to acquire adequate target data while minimizing the annotation effort required for learning, I propose PADMA, an efficient association-based multiple-instance learning approach for facial affect recognition with coarse-grained annotations. I then empirically demonstrate that my proposed user-dependent models considerably outperform their state-of-the-art counterparts in facial affect recognition across different facial datasets. I further extend this investigation to produce fast-PADMA, which addresses the effectiveness of two types of user-dependent models: the user-specific model, which learns only from the target user's data, and the user-adaptive model, which is trained on both the target and the source subjects. Each model has its own advantages. Given sufficient personal data, the user-specific model can fully accommodate the diverse aspects of the target user, including facial geometry as well as expression preference. The user-adaptive model, on the other hand, is able to adapt knowledge from a large number of source subjects, and thus requires relatively little target-specific data to achieve satisfactory performance, which accelerates the learning process. Depending on the amount of target-specific data available for a particular context, we can select the most appropriate form of the user-dependent model. My findings therefore suggest that it is feasible to build a well-performing user-dependent facial affect model for a particular user with only a limited amount of coarse-grained annotations.
For visual attention, I use experiments to illustrate the correlation between eye gaze behavior and interactions in daily human-computer activities, such as mouse clicks and keypresses; these show that the correlation depends on the context as well as on user affect. I then demonstrate this further through PACE, which refines and adopts daily interaction-informed data for gaze learning in an implicit manner, without the need for user annotation or intrusive calibration. The coordination pattern between gaze movement and mouse clicks is likewise indicative of mental states such as stress. The results show success in learning the visual attention location from the noisy interaction-informed data, and suggest promise in using gaze-click coordination patterns to infer stress level.
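As a rough illustration of what learning gaze location from noisy interaction-informed data can involve, the sketch below fits a one-dimensional mapping from an eye feature to the horizontal click position, treating clicks made while the user looked elsewhere as outliers. It is not the PACE method described in the thesis: the robust Theil-Sen line fit is swapped in as a stand-in, and the names `theil_sen`, `feature`, and `click_x`, the sample values, and the linear eye-to-screen relation are all assumptions made for the example.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Robust line fit y = slope * x + intercept (Theil-Sen estimator).

    The slope is the median of all pairwise slopes, so a minority of
    mismatched samples cannot pull the fit far from the majority trend.
    """
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1
    ]
    if not slopes:
        raise ValueError("need at least two samples with distinct x values")
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# Hypothetical data: one eye-feature reading per mouse click (arbitrary units)
# and the click's horizontal screen position in pixels.
feature = list(range(12))
click_x = [100 * f + 200 for f in feature]  # assumed true relation

# Assumption: at three clicks the user was looking somewhere else, so the
# click position does not match the eye feature recorded at that moment.
click_x[2], click_x[7], click_x[10] = 1200, 250, 300

slope, intercept = theil_sen(feature, click_x)


def predict_gaze_x(f):
    """Estimated horizontal gaze position for eye-feature value f."""
    return slope * f + intercept
```

Because most click samples agree with the assumed relation, the median-based fit recovers it despite the three mismatched clicks; an ordinary least-squares fit on the same data would be biased by them. A real system would need a two-dimensional mapping and richer eye features.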

Files in this item

Files Size Format
b29256045.pdf 6.195Mb PDF

