Attention-driven image interpretation, annotation and retrieval

Pao Yue-kong Library Electronic Theses Database


Author: Fu, Hong
Title: Attention-driven image interpretation, annotation and retrieval
Degree: Ph.D.
Year: 2007
Subject: Hong Kong Polytechnic University -- Dissertations.
Image analysis.
Image transmission.
Image processing -- Mathematics.
Department: Dept. of Electronic and Information Engineering
Pages: xx, 169 leaves : ill. (chiefly col.) ; 30 cm.
Language: English
InnoPac Record: http://library.polyu.edu.hk/record=b2116774
URI: http://theses.lib.polyu.edu.hk/handle/200/160
Abstract: This thesis presents novel attention-driven techniques for image interpretation, annotation and retrieval. Four main contributions are reported: (1) an attention-driven image interpretation method with application to image retrieval; (2) an efficient algorithm for attention-driven image interpretation from image segments; (3) a pre-classification technique that separates "attentive" from "non-attentive" images, together with a combined retrieval strategy; and (4) a semantic network for image annotation that models attentive objects and their correlations.

In the first investigation, we propose an attention-driven image interpretation method that iteratively pops out visually attentive objects from an image by maximizing a global attention function. The method interprets an image as containing several perceptually attended objects as well as the background, where each object is assigned an attention value. The attention values of the attentive objects are then mapped to importance measures to facilitate subsequent image retrieval. An attention-driven matching algorithm is proposed, based on a retrieval strategy that emphasizes attended objects. Experiments on 7,376 Hemera color images annotated by keywords show that the retrieval results of our attention-driven approach compare favorably with those of conventional methods, especially when important objects are heavily concealed by an irrelevant background.

In the second investigation, the computational cost of attention-driven image interpretation is addressed. Object reconstruction is a combinatorial optimization problem with a complexity of 2^N, which is computationally very expensive when the number of segments N is large. We formulate the attention-driven image interpretation process as a matrix representation, and propose an efficient algorithm based on elementary transformations of the matrix that reduces the computational complexity to 3N(N-1)^2/2. Experimental results on both synthetic and real data show an acceptably small degradation in the accuracy of object formation, while the processing speed is significantly increased.

In the third investigation, an all-season image retrieval system is proposed, which handles both images with clearly attentive objects and those without. First, using the visual contrasts and spatial information of an image, a neural network trained with the Back Propagation Through Structure (BPTS) algorithm classifies the image as "attentive" or "non-attentive". An "attentive" image is then processed by an attentive retrieval strategy that emphasizes attentive objects, while a "non-attentive" image is processed by a fusing-all retrieval strategy. This combined system yields improved retrieval performance.

In the fourth investigation, we propose an image annotation method based on attentively interpreted images. First, a number of interpreted images are annotated manually as training samples. Second, a semantic network is constructed that stores both the visual classifiers of attentive objects and the correlations among concepts. Finally, an annotation strategy is proposed that uses the semantic network to annotate objects. Experimental results show that the trained semantic network produces good annotation results, especially when a visual classifier does not yield a precise concept.
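The complexity argument in the second investigation can be illustrated with a toy sketch. This is not the thesis's matrix-transformation algorithm: the attention function (segment saliencies plus pairwise affinities) and the greedy growth heuristic below are illustrative assumptions, standing in for the actual method only to show why exhaustively scoring all 2^N segment subsets is infeasible while a polynomial-time search remains tractable.

```python
from itertools import combinations

def attention(subset, saliency, affinity):
    """Toy attention score: sum of segment saliencies plus pairwise affinities."""
    score = sum(saliency[i] for i in subset)
    score += sum(affinity.get((min(a, b), max(a, b)), 0.0)
                 for a, b in combinations(subset, 2))
    return score

def brute_force_best(n, saliency, affinity):
    """Exhaustive baseline: scores all 2^N - 1 non-empty subsets of segments."""
    best, best_score = set(), float("-inf")
    for r in range(1, n + 1):
        for subset in combinations(range(n), r):
            s = attention(subset, saliency, affinity)
            if s > best_score:
                best, best_score = set(subset), s
    return best, best_score

def greedy_best(n, saliency, affinity):
    """Polynomial-time heuristic: seed with the most salient segment, then
    keep adding the segment that most improves the score, stopping when no
    addition helps. O(N^2) attention evaluations instead of O(2^N)."""
    chosen = {max(range(n), key=lambda i: saliency[i])}
    score = attention(chosen, saliency, affinity)
    improved = True
    while improved:
        improved = False
        for i in set(range(n)) - chosen:
            candidate_score = attention(chosen | {i}, saliency, affinity)
            if candidate_score > score:
                chosen, score = chosen | {i}, candidate_score
                improved = True
    return chosen, score
```

On small inputs the greedy search often matches the exhaustive optimum while evaluating far fewer subsets; the thesis's elementary-transformation algorithm achieves a comparable speed/accuracy trade-off through a matrix formulation rather than this greedy scheme.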

Files in this item

Files Size Format
b21167746.pdf 5.200Mb PDF
