|Title:||Video-Based Product Search Services|
|Subject:||Hong Kong Polytechnic University -- Dissertations|
Web search engines
|Department:||Department of Computing|
|Pages:||vii, 62 leaves : ill. ; 30 cm.|
|Abstract:||This thesis proposes the basic idea of a video-based product search service. The main challenges in a web-scale computer vision application are a robust image feature, an efficient indexing solution, and correspondingly fast search algorithms. For the image feature, SIFT is used because it is invariant to scale and rotation. SIFT features are extracted from product images that a web crawler collects from the Internet. Once the SIFT repository is built, a corresponding ground truth table is generated to support the indexing and searching steps. The k-d tree, combined with principal component analysis (PCA), is used as the indexing solution. The leaf nodes of the k-d tree store the entries assigned to each SIFT point in the repository. Because the k-d tree works well for low-dimensional features, PCA is a good choice for reducing the dimensionality of the SIFT feature vectors. For searching, the query can be a single image or a short video. For an image query, the SIFT points are extracted first, and the image is represented by that set of SIFT points. After preprocessing, dimensionality reduction is applied so that the query feature vectors have the same dimension as those stored in the prebuilt data set. Each SIFT point from the query image is then searched for its best match in the k-d tree, descending from the root node to a leaf node. At each level of the tree, the branch is chosen according to the value of the splitting point. Once the leaf level is reached, a nearest-neighbor (NN) query matches the incoming SIFT point against all points whose index entries are stored in that leaf node. A threshold is used to tune and control the matching algorithm.
A video query contains much more information, and if every SIFT point extracted from the video frames were searched, returning the results would take minutes. A spatio-temporal pruning approach is therefore applied to the SIFT points of the query video so that only the stable points are kept, although this makes the preprocessing step longer than for an image query. Performance results for image and video queries are given in this report, and the simulation results are encouraging.|
|Rights:||All rights reserved|
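The leaf-level search described in the abstract could be sketched roughly as follows. This is an illustrative sketch only, not the thesis implementation: the names `Node`, `build`, and `match`, the `leaf_size` parameter, the median split rule, and the Euclidean distance are all assumptions, and the vectors are assumed to have already been reduced by PCA. As in the abstract, the query descends from the root to a single leaf without backtracking, so the result is an approximate nearest neighbor that is accepted only if it falls within a threshold.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Optional, Sequence

Vector = Sequence[float]


@dataclass
class Node:
    # Internal nodes carry a split; leaf nodes carry repository entry indices.
    axis: int = 0
    split: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    entries: Optional[List[int]] = None


def build(points: List[Vector], ids: List[int],
          leaf_size: int = 2, depth: int = 0) -> Node:
    """Build a k-d tree whose leaves hold up to leaf_size entry indices.

    leaf_size must be at least 1. The split axis cycles with depth and the
    split value is the median coordinate (an assumed, common choice).
    """
    if len(ids) <= leaf_size:
        return Node(entries=list(ids))
    axis = depth % len(points[0])
    ordered = sorted(ids, key=lambda i: points[i][axis])
    mid = len(ordered) // 2
    return Node(
        axis=axis,
        split=points[ordered[mid]][axis],
        left=build(points, ordered[:mid], leaf_size, depth + 1),
        right=build(points, ordered[mid:], leaf_size, depth + 1),
    )


def match(root: Node, points: List[Vector], query: Vector,
          threshold: float) -> Optional[int]:
    """Descend to one leaf, then run a linear NN query inside that leaf.

    Returns the index of the best entry, or None if the leaf is empty or
    the best distance exceeds the threshold.
    """
    node = root
    while node.entries is None:
        # Choose the branch from the split value at this level.
        node = node.left if query[node.axis] < node.split else node.right
    if not node.entries:
        return None
    best = min(node.entries, key=lambda i: dist(points[i], query))
    return best if dist(points[best], query) <= threshold else None


if __name__ == "__main__":
    # Toy 2-D "reduced descriptors"; real SIFT vectors would be larger.
    repo = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 6.0)]
    tree = build(repo, list(range(len(repo))))
    print(match(tree, repo, (0.9, 1.2), threshold=0.5))  # 1
    print(match(tree, repo, (3.0, 3.0), threshold=0.5))  # None
```

A lower threshold rejects more weak matches at the cost of missing some true ones; the values above are arbitrary and chosen only to show both outcomes.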
Files in This Item:
|b24268690.pdf||For All Users (off-campus access for PolyU Staff & Students only)||4.7 MB||Adobe PDF||View/Open|