Understanding human comprehension and attention in reading

Li, Jiajia

Author:	Li, Jiajia
Title:	Understanding human comprehension and attention in reading
Advisors:	Ngai, Grace (COMP); Chan, C. F. Stephen (COMP)
Degree:	Ph.D.
Year:	2017
Subject:	Hong Kong Polytechnic University -- Dissertations; Reading -- Physiological aspects; Attention -- Physiological aspects; Eye -- Movements; Human-computer interaction
Department:	Department of Computing
Pages:	xvi, 119 pages : color illustrations
Language:	English
Abstract:	Reading is one of the most common computer interaction activities and one of the most fundamental means of knowledge acquisition. With the development of computing technologies and the growing popularity of e-Learning platforms, understanding human attention and comprehension through reading behaviors has the potential to become an important means of enhancing the learning experience and its effectiveness. Eye gaze patterns are known to play an important role in the study of reading behaviors, since reading can be considered a task in which visual processing and sensorimotor control take place in a highly structured visual environment [79]. Many studies have shown that eye movements and eye behavior during reading are closely related to human mental states such as comprehension and attention [81][88][89]. Current state-of-the-art research on comprehension and attention detection based on eye gaze patterns has two main drawbacks. First, many studies rely on expensive and intrusive devices, such as electrooculography systems to track eye movement or electroencephalography (EEG) devices to obtain the user's mental state as ground truth. Second, numerous methods study how lexical and linguistic variables affect eye gaze behavior during reading, and therefore rely on the availability of a linguistic analysis of the reading materials. To address these limitations, we conduct experiments with human subjects and study in depth how eye gaze patterns relate to changes in comprehension level and attention level during reading. Both a Tobii eye tracker and an off-the-shelf webcam are used to capture the eye gaze signals from which the eye gaze features are extracted. Using machine learning algorithms, we evaluate the features and compare classification performance across different kinds of eye gaze features.
From the investigation, we gain a better understanding of the relation between the studied human mental states, i.e. comprehension and attention, and certain eye gaze patterns. We also find that features extracted from the accurate on-screen gaze locations captured by the Tobii eye tracker contribute more to detecting comprehension and attention levels during reading. In order to recognize human mental states, input signals reflecting those states need to be acquired and processed. Under traditional KVM (keyboard-video-mouse) settings, input signals are mostly tied to keyboard and mouse dynamics. One can deduce some information about human mental states and affects from keyboard [12][111] and mouse [110][123] dynamics, but the accuracy is not particularly high. Thanks to the popularity of interactive social networking applications, the webcam has become a de facto standard device. Recent research in video processing and machine learning has demonstrated that human affects can be recognized from webcam video, notably via facial features [127]. Inspired by previous research, we look into other modalities, i.e. facial expressions and mouse dynamics, for attention detection during reading. A two-level facial feature extraction approach is proposed to represent the static and dynamic states of the subjects' facial expressions. Similarly, mouse dynamic features are extracted from the logged mouse events and evaluated for reading attention detection. To evaluate our method, we apply machine learning techniques to build user-independent models that recognize human attention and comprehension levels in reading tasks. We compare the performance of models built on a single modality and on multiple modalities. The findings suggest that the multimodal approach outperforms the unimodal approach in our studies.
The results also demonstrate that eye gaze patterns and facial expressions show more potential for predicting attention level than mouse dynamics, which may be because the mouse is rarely used as an input device in the reading task.
Rights:	All rights reserved
Access:	open access

Files in This Item:

File	Description	Size	Format
991021965756703411.pdf	For All Users	2.82 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/9133