|Title:||On representation based pattern classification models|
|Subject:||Pattern recognition systems|
Hong Kong Polytechnic University -- Dissertations
|Pages:||xviii, 192 leaves : illustrations ; 30 cm|
|Abstract:||In computer vision and pattern recognition, there are a variety of image based classification tasks, e.g., face recognition, action recognition, object recognition, texture classification, handwritten digit recognition, etc. Choosing a suitable classifier for a given classification task is not a trivial problem; it depends on data type, data distribution, data size, and feature properties. According to the "no free lunch" theorem in machine learning, no single classifier can always achieve state-of-the-art performance in all classification tasks. Intuitively, a robust, efficient, and scalable classifier with good understandability and generalization ability is always desired. Representation based classification has been widely used in pattern classification and achieves superior performance. It is based on the assumption that a query sample can be more accurately approximated by a linear combination of training samples of its own class than by those of other classes. Many representation based classification models have been developed, including sparse/collaborative representation, low-rank representation, robust representation, kernel representation, generic representation, multi-modal/cross-modal representation, etc. Representation residuals in these models are discriminative, and a query sample can be assigned to the class with the minimal reconstruction residual. Meanwhile, representation coefficients can also be used as features to enhance classification. In addition, in middle-level feature extraction, sparse coding can be introduced, in contrast to vector quantization, to obtain a soft representation for classification. Although representation based classification models have achieved great success in different classification tasks, many problems remain.
When there are only a small number of training samples, the representation tends to be over-determined and therefore the query sample may not be well represented. When the number of training samples is very large, the time complexity and memory consumption of representation based classifiers become a challenging issue. Besides, the existing representation based classifiers are mostly designed for single image based classification tasks. However, for video based face recognition and multi-view object recognition, the task becomes an image set classification problem. It is therefore necessary to extend representation based classifiers from image based to image set based models. Finally, most existing representation based classifiers are non-discriminative in the representation process. It is interesting to investigate whether the samples can be projected to a discriminative feature space to enhance the classification performance. In this thesis, we aim to develop new representation based classification models for small sample size problems, big sample size problems, image set classification problems, and discriminative representation problems, respectively. In Chapter 2, to solve the small sample size problem in face recognition, a patch based collaborative representation classifier (PCRC) is proposed. Both the query and gallery face images are divided into patches, and each query patch is then represented by the gallery patch dictionary. The classification outputs of all the patches are combined by majority voting to get the final output. As PCRC is sensitive to patch size, a multi-scale PCRC is proposed to fuse the classification outputs of different patch sizes by margin distribution optimization.|
In Chapter 3, a local generic representation (LGR) based approach is proposed for face recognition with a single sample per person. A generic intra-class variation dictionary is constructed from a generic dataset, and it can well compensate for the face variations lacking in the gallery set. A correntropy based metric is adopted to measure the loss of each patch so that the importance of different patches in face recognition can be more robustly evaluated. In Chapter 4, a self-representation induced classifier (SRIC) is proposed for representation with big sample size. Different from the existing sample-level representation, we propose representation based classifiers from the perspective of feature-level representation. The time complexity of SRIC depends only on the feature dimension and the number of classes. Hence, it is very suitable for classification tasks with a large number of training samples and a small number of classes. In Chapter 5, an image set based collaborative representation model is proposed for image set based face recognition. Considering the distinctiveness of samples in the query image set and the correlation between the gallery image sets, we model both the query and gallery image sets as hulls. Then the hull of the query image set is collaboratively represented on the gallery image sets. Regularized hull and kernel convex hull are both considered to develop robust image set based collaborative representation classifiers. In Chapter 6, by considering representation based classifiers as point-to-set distance based classifiers, we extend distance metric learning from point-to-point distance to point-to-set and set-to-set distance. The metric learning problem is modeled as a sample pair classification task and can be efficiently solved by standard support vector machine solvers.
To sum up, in this thesis we developed patch based collaborative representation, local generic representation, regularized self-representation, image set based collaborative representation, and point-to-set/set-to-set distance metric learning methods to address the representation problems with small sample size, big sample size, and image sets for pattern recognition, respectively. Our extensive experimental results demonstrated the state-of-the-art performance of the proposed methods. In future work, we will investigate generic dictionary learning for face recognition in the wild, as well as cross-modal/multi-modal dictionary learning and metric learning methods under the representation based pattern classification framework.
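The minimal-residual rule described in the abstract can be illustrated with a small sketch. This is a generic ridge-regularized collaborative representation classifier written for illustration, not the thesis's own implementation; the function name `crc_classify`, the regularization weight `lam`, and the toy data are all assumptions introduced here.

```python
import numpy as np


def crc_classify(X, labels, y, lam=1e-2):
    """Assign query y to the class with the smallest reconstruction residual.

    X      : (d, n) matrix whose columns are training samples.
    labels : length-n array with the class label of each column of X.
    y      : (d,) query sample.
    lam    : ridge regularization weight (illustrative value, an assumption).
    """
    labels = np.asarray(labels)
    n = X.shape[1]
    # Code the query over ALL training samples jointly (ridge regression).
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        mask = labels == c
        # Reconstruct y using only the samples and coefficients of class c.
        residuals.append(np.linalg.norm(y - X[:, mask] @ alpha[mask]))
    residuals = np.array(residuals)
    return classes[np.argmin(residuals)], residuals


# Toy data (hypothetical): two classes, three 20-dimensional samples each.
rng = np.random.default_rng(0)
basis0 = rng.normal(size=(20, 3))
basis1 = rng.normal(size=(20, 3))
X = np.hstack([basis0, basis1])
labels = np.array([0, 0, 0, 1, 1, 1])

# The query is a linear combination of class-1 samples.
query = basis1 @ np.array([0.5, 0.3, 0.2])
pred, res = crc_classify(X, labels, query)
print("predicted class:", pred)
print("per-class residuals:", res)
```

Because the toy query lies in the span of the class-1 samples, its class-1 residual is close to zero while the class-0 residual stays large. A sparse (l1-regularized) coding step could replace the ridge solve without changing the residual-based decision rule.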