Author: Zhu, Linnan
Title: Learning a lightweight convolutional neural network for visual tracking and facial attribute analysis
Advisors: Zhang, Lei (COMP)
Degree: M.Phil.
Year: 2017
Subject: Hong Kong Polytechnic University -- Dissertations; Human face recognition (Computer science); Image processing -- Digital techniques; Image analysis -- Data processing
Department: Department of Computing
Pages: xiv, 79 pages : color illustrations
Language: English
Abstract: In this thesis, we study object tracking and facial attribute analysis, in particular age and gender recognition. For object tracking, CNN-based trackers have recently been proposed to improve tracking performance. Despite achieving state-of-the-art performance, existing CNN trackers still have notable drawbacks. 1) Most of these methods use two separate CNNs, one for each input, which greatly increases the number of model parameters and consequently requires more labeled samples at the training stage. 2) Some CNN trackers run at over 100 fps on a GPU but run very slowly on a CPU because of the high complexity of the network structure. To address these issues, we propose a novel frame-pair-based CNN architecture that balances tracking speed and accuracy. Instead of adopting two-stream CNNs, we fuse frame pairs at the input stage, resulting in a single-stream CNN tracker with far fewer parameters. The proposed tracker can learn generic motion patterns of objects from less video data than previous CNN-based methods. The evaluation is conducted on the VOT14, OTB50 and OTB100 benchmark datasets. The proposed tracker achieves results competitive with state-of-the-art trackers but with much lower memory use and complexity. Our tracker can track objects at over 100 fps on a GPU and over 30 fps on a CPU, much faster than most existing CNN-based trackers.
For age and gender recognition, CNN-based methods have achieved state-of-the-art accuracy, but they are time-consuming on mobile devices and low-end PCs for two reasons. 1) Complex CNN architectures. Most CNN-based methods directly employ popular architectures (e.g., AlexNet and VGG), which are very complex and overdesigned for age and gender recognition. 2) Treating age and gender recognition as two independent problems. In fact, age and gender recognition are two highly correlated facial attribute tasks, and optimizing them together is beneficial. In this thesis, we propose a lightweight deep model that recognizes age and gender from a face image via a joint regression model. Specifically, our model employs a multi-task learning scheme to learn shared features for these two correlated tasks in an end-to-end manner. Extensive experimental results on the recent Adience benchmark demonstrate that our model achieves recognition accuracy competitive with state-of-the-art methods but at much higher speed, i.e., about 10 times faster in the testing phase.
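Note on the parameter argument: the abstract's claim that fusing frame pairs at the input yields far fewer parameters than a two-stream design can be illustrated with a rough count. The sketch below is not the thesis's network. The layer widths in `LAYERS` are hypothetical, the two streams are assumed not to share weights, fusion is assumed to be channel concatenation of two RGB frames, and fully connected layers are ignored.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of one 2-D convolution layer with a square kernel."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def stack_params(in_channels, layers):
    """Total parameters of a stack of conv layers.

    `layers` is a list of (out_channels, kernel_size) pairs applied in order.
    """
    total = 0
    for out_channels, kernel_size in layers:
        total += conv_params(in_channels, out_channels, kernel_size)
        in_channels = out_channels
    return total


# Hypothetical layer widths, chosen only for illustration (not from the thesis).
LAYERS = [(32, 5), (64, 3), (128, 3)]

# Two-stream (assumed unshared weights): one full stack per RGB frame.
two_stream = 2 * stack_params(3, LAYERS)

# Single-stream: both RGB frames concatenated along channels (3 + 3 = 6),
# so only the first layer grows; every later layer exists once instead of twice.
single_stream = stack_params(6, LAYERS)

print(two_stream, single_stream)
```

Under these assumptions the single-stream stack needs roughly half the convolutional parameters, because only the first layer sees the doubled input channels.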
Rights: All rights reserved
Access: open access

Files in This Item:
File: 991021965755003411.pdf
Description: For All Users
Size: 2.19 MB
Format: Adobe PDF




Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/9138