Novel feature representation and matching techniques for content-based image retrieval

Pao Yue-kong Library Electronic Theses Database

Author: Wang, Zhiyong
Title: Novel feature representation and matching techniques for content-based image retrieval
Degree: Ph.D.
Year: 2003
Subject: Hong Kong Polytechnic University -- Dissertations; Image processing -- Digital techniques; Optical pattern recognition
Department: Dept. of Electronic and Information Engineering
Pages: xxii, 157 leaves : ill. ; 30 cm
Language: English
Abstract: This thesis presents novel feature representation schemes and matching techniques for content-based image retrieval. It reports three main contributions: (1) a block-constrained fractal coding scheme and an improved nona-tree decomposition based matching technique for image retrieval; (2) a thinning-based starting point localization method and the application of a fuzzy integral to the combination of different shape feature sets for plant leaf image retrieval; and (3) tree-structured content representation and its adaptive processing for content-based image retrieval with relevance feedback.

In the first investigation, an image is partitioned into non-overlapping blocks of a size similar to that of an iconic query image. Fractal code is efficiently generated for each block individually. To measure the similarity between the fractal codes of two images, an improved nona-tree decomposition scheme is adopted to avoid matching the fractal code globally, which significantly reduces computational complexity. This matching scheme can also eliminate possible false matches in the matching process. With improvements in both fractal coding and image matching, image retrieval with fractal code can be performed effectively and efficiently.

In the second investigation, a robust fuzzy integral method is used to combine three shape feature sets, namely, centroid-contour distance (CCD) curve, moment invariants (MIs), and angle code histogram (ACH), for shape-based plant leaf image retrieval. Unlike MIs and the ACH feature set, the CCD curve is neither scale nor rotation invariant. Hence, a normalization scheme is needed for the CCD curve to achieve scale invariance, and an efficient starting point localization method is required to achieve rotation invariance in the similarity measure of CCD curves.
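The need for both normalization and a starting point can be illustrated with a minimal sketch. It is not the thesis's method: the function names (`ccd_curve`, `ccd_distance`), the use of the mean of the contour points as the centroid, index-based resampling, and division by the maximum distance are all assumptions made for illustration. In particular, the sketch tries every circular shift of the curve, which is the brute-force search that a starting point localization method is meant to avoid.

```python
import math

def ccd_curve(contour, n_samples=8):
    """Centroid-contour distance curve, scaled so its maximum is 1.

    Assumption: the centroid is the mean of the contour points, and the
    curve is resampled by index to a fixed length.
    """
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    dists = [math.hypot(x - cx, y - cy) for x, y in contour]
    step = len(dists) / n_samples
    sampled = [dists[int(i * step)] for i in range(n_samples)]
    peak = max(sampled)
    return [d / peak for d in sampled]  # dividing by the peak removes scale

def ccd_distance(a, b):
    """Smallest Euclidean distance over all circular shifts of b (brute force)."""
    n = len(a)
    best = math.inf
    for shift in range(n):
        d = math.sqrt(sum((a[i] - b[(i + shift) % n]) ** 2 for i in range(n)))
        best = min(best, d)
    return best

def plain_distance(a, b):
    """Euclidean distance with no shift, i.e. starting points assumed aligned."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# A square contour, and the same square enlarged, translated, and traced
# from a different starting point.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
moved = [(3 * x + 10, 3 * y - 4) for x, y in square]
moved = moved[3:] + moved[:3]

curve_a = ccd_curve(square)
curve_b = ccd_curve(moved)
unaligned = plain_distance(curve_a, curve_b)
aligned = ccd_distance(curve_a, curve_b)
```

Here `unaligned` is clearly non-zero even though the two shapes are identical up to scale and translation, while `aligned` is zero up to rounding. The cost of `ccd_distance` grows quadratically with the curve length, which is why restricting the search to a few candidate starting points matters for matching speed.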
In this investigation, a thinning-based method is proposed to locate possible starting points of a leaf contour, making our approach more computationally efficient for image matching. Our proposed starting point localization method can also benefit other shape representation schemes that are sensitive to starting points. In order to combine the three feature sets objectively yet consistently with human perception, a fuzzy integral is employed to combine the similarity measures of the three shape feature sets. The fuzzy integral approach has two distinct advantages: (a) releasing the user from the burden of tuning the combination parameters required for a weighted summation based approach; (b) guaranteeing an optimal or near optimal combination performance.

In the third investigation, a novel structural representation of image content and shape pattern and its adaptive processing are proposed for content-based image retrieval with relevance feedback. In the research on relevance feedback in content-based image retrieval, both quad-tree decomposition and EdgeFlow-based segmentation approaches were adopted for the tree construction of image content. By using such representation schemes, the image content can be explored from coarse levels to fine levels and from global features to local details through the tree nodes at different levels. In the tree structure, a copy of the same feedforward neural network is applied to each tree node, receiving as its input the states of its child nodes and a few simple attributes that can be obtained efficiently. Similar to the BackPropagation Through Time (BPTT) learning algorithm, the BackPropagation Through Structure (BPTS) algorithm is adopted to train the neural network. A relevance feedback technique using the tree structure content representation and the BPTS training algorithm is discussed in the thesis. The proposed relevance feedback technique can easily implement the similarity grading during the user's feedback process.
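The idea of applying one shared network to every node of a quad-tree can be sketched as follows. This is only an illustration under stated assumptions: the node attributes (mean and variance of grey levels), the single tanh layer, the state dimension, and the names `Node`, `build_quadtree`, and `SharedNet` are hypothetical and not taken from the thesis. Only the bottom-up forward pass is shown; training with BPTS is omitted.

```python
import math
import random

class Node:
    def __init__(self, attrs, children=()):
        self.attrs = attrs              # a few simple attributes (assumed: mean, variance)
        self.children = list(children)  # empty for leaf nodes

def build_quadtree(img, x, y, size, min_size):
    """Recursively split a square grey-level region into four quadrants."""
    pixels = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    children = []
    if size > min_size:
        half = size // 2
        children = [build_quadtree(img, x + dx, y + dy, half, min_size)
                    for dy in (0, half) for dx in (0, half)]
    return Node([mean, var], children)

class SharedNet:
    """One single-layer tanh network whose weights are reused at every node."""
    def __init__(self, n_attrs, state_dim, max_children, seed=0):
        rng = random.Random(seed)
        n_in = n_attrs + state_dim * max_children
        self.state_dim = state_dim
        self.max_children = max_children
        self.weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                        for _ in range(state_dim)]
        self.bias = [0.0] * state_dim

    def state(self, node):
        # Child states are computed first, so information flows bottom-up.
        child_states = [self.state(c) for c in node.children]
        # Missing children (at leaves) are padded with zero states.
        while len(child_states) < self.max_children:
            child_states.append([0.0] * self.state_dim)
        inputs = list(node.attrs) + [v for s in child_states for v in s]
        return [math.tanh(sum(w * v for w, v in zip(row, inputs)) + b)
                for row, b in zip(self.weights, self.bias)]

# A 4x4 image with grey levels in [0, 1]; the tree has a root and four leaves.
image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0],
         [0.5, 0.5, 0.2, 0.8],
         [0.5, 0.5, 0.8, 0.2]]
tree = build_quadtree(image, 0, 0, 4, min_size=2)
net = SharedNet(n_attrs=2, state_dim=3, max_children=4)
root_state = net.state(tree)
```

The state at the root summarizes the whole tree in a fixed-length vector, regardless of how many nodes the decomposition produces. In a relevance feedback setting, a vector of this kind could be mapped to a relevance score and the shared weights adjusted from the user's feedback.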
To evaluate the retrieval performance of our approach, a synthesized texture database and a scenery database were tested. For the texture database, only the quad-tree decomposition method was used to construct the tree structure representation. For the scenery database, both the quad-tree decomposition method and the EdgeFlow-based segmentation approach were used to construct the tree structure representation. Our relevance feedback approach, based on the tree structure representation with BPTS learning, is compared with other relevance feedback techniques.

In the research on content-based shape pattern retrieval with relevance feedback, the approach adopted is the same as that of content-based image retrieval with tree structure representation and BPTS learning, except that a very different scheme is utilized for the tree construction of shape contours. An approach employing scale-space filtering is proposed to organize the shape primitives hierarchically in terms of their contribution to the overall shape contour appearance. Such an approach is motivated by the fact that shape details are eliminated when the scale increases during the scale-space evolution, whereas visually significant properties are kept even at a large scale. Adopting such a representation has two main advantages: (1) it is believed that this representation is an important step in approaching human perception; and (2) such a representation is invariant to translation, rotation and scaling, and is robust to noise. Experimental results on artificially generated shape patterns and on gesture patterns show the superiority of our approach compared with other techniques. The thesis also reports the performance of gesture pattern recognition using the multi-scale structural contour representation and BPTS learning.
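The motivating fact behind the scale-space approach, that fine shape details vanish as the scale grows, can be illustrated with a small sketch. It is an assumption-laden illustration rather than the thesis's procedure: the contour coordinates are smoothed with a circular Gaussian kernel, and the number of changes in turning direction along the contour is used as a simple stand-in for the amount of detail. The function names and the rippled test contour are hypothetical, and the hierarchical organization of shape primitives is not reproduced.

```python
import math

def gaussian_kernel(sigma):
    """Normalized Gaussian weights truncated at three standard deviations."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_closed(values, sigma):
    """Circular convolution of one coordinate of a closed contour."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    return [sum(kernel[j + radius] * values[(i + j) % n]
                for j in range(-radius, radius + 1))
            for i in range(n)]

def count_inflections(xs, ys):
    """Count changes of turning direction along a closed contour."""
    n = len(xs)
    turns = []
    for i in range(n):
        ax, ay = xs[i] - xs[i - 1], ys[i] - ys[i - 1]
        bx, by = xs[(i + 1) % n] - xs[i], ys[(i + 1) % n] - ys[i]
        cross = ax * by - ay * bx
        if abs(cross) > 1e-12:          # ignore numerically straight segments
            turns.append(cross > 0)
    return sum(turns[i] != turns[i - 1] for i in range(len(turns)))

# A circle with eight ripples, sampled at 128 points.
n_points = 128
angles = [2 * math.pi * i / n_points for i in range(n_points)]
radii = [1 + 0.2 * math.cos(8 * t) for t in angles]
xs = [r * math.cos(t) for r, t in zip(radii, angles)]
ys = [r * math.sin(t) for r, t in zip(radii, angles)]

fine = count_inflections(xs, ys)
coarse = count_inflections(smooth_closed(xs, 8.0), smooth_closed(ys, 8.0))
```

At the original scale the ripples produce many changes of turning direction, whereas after smoothing with a large `sigma` the contour is close to a plain circle and `coarse` is much smaller than `fine`. Recording the scale at which each detail disappears is one way such features could be ranked by their contribution to the overall shape.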

Files in this item

Files Size Format
b17329413.pdf 10.69Mb PDF
Copyright Undertaking
As a bona fide Library user, I declare that:
  1. I will abide by the rules and legal ordinances governing copyright regarding the use of the Database.
  2. I will use the Database for the purpose of my research or private study only and not for circulation or further reproduction or any other purpose.
  3. I agree to indemnify and hold the University harmless from and against any loss, damage, cost, liability or expenses arising from copyright infringement or unauthorized usage.
By downloading any item(s) listed above, you acknowledge that you have read and understood the copyright undertaking as stated above, and agree to be bound by all of its terms.

