Learning approaches for scene localization and quality scene reconstruction

Li, Chu Tak

Full metadata record

DC Field	Value	Language
dc.contributor	Department of Electronic and Information Engineering	en_US
dc.contributor.advisor	Siu, Wan-chi (EIE)	en_US
dc.contributor.advisor	Lun, Pak Kong Daniel (EIE)	en_US
dc.creator	Li, Chu Tak	-
dc.identifier.uri	https://theses.lib.polyu.edu.hk/handle/200/12268	-
dc.language	English	en_US
dc.publisher	Hong Kong Polytechnic University	en_US
dc.rights	All rights reserved	en_US
dc.title	Learning approaches for scene localization and quality scene reconstruction	en_US
dcterms.abstract	Vision-based autonomous driving techniques are popular in both academia and industry because commodity cameras are highly cost-effective, produce high-quality images, and capture rich information. Global Navigation Satellite Systems (GNSS) are widely used for real-world ego-localization and related applications. However, GNSS signals suffer from reflection and blocking by dense concrete buildings and tall trees, especially in densely populated urban areas like Hong Kong. Other solutions use high-level sensors such as Lidar, Radar and 360 RGB-D cameras. Nevertheless, these solutions have their own limitations and are not widely used in commercial products. Therefore, various technologies, including the visual place recognition and reconstruction methods discussed in this thesis, will be required to achieve a comprehensive autonomous driving system.	en_US
dcterms.abstract	Place recognition, or localization, is an important element of an autonomous driving system. Accurate ego-location information is crucial both for removing accumulated past errors and for future planning. The challenges lie in variations in appearance, speed, lighting, perspective and objects. Therefore, we develop a fast algorithm for place recognition, in which fast tracking using historical information and effective frame representation are comprehensively studied to achieve satisfactory recognition performance while minimizing computational cost. We call this use of historical information a tubing strategy, which emphasizes the temporal correlation between consecutive input frames.	en_US
dcterms.abstract	We take advantage of recent deep learning techniques and remove two main barriers to using Convolutional Neural Networks (CNNs), i.e., heavy computational cost and the need for a large amount of labelled data, so that deep learning techniques can be used for efficient place recognition. We study lightweight CNN models to offer efficient feature extraction and improve an existing automatic training data generation module by considering more variations in conditions. We further propose a way to adaptively use the historical information to tackle the tasks of unknown initial location and efficient recognition. The proposed methods outperform other state-of-the-art methods in terms of both recognition performance and complexity.	en_US
dcterms.abstract	To ensure the quality of the features extracted from images, we also study object removal by means of deep learning-based image inpainting for scene reconstruction. By removing unwanted objects such as moving vehicles and pedestrians from images, we can obtain clean images for place recognition. We propose the Deep Generative Inpainting Network (DeepGIN) and an inpainting model with a Multi-Dilation Fusion Block (MDFB) and an auxiliary attention learning branch, which seek a better balance between pixel-wise accuracy and visual quality. We show that our proposed models can handle wild images by testing them on several publicly available datasets: Flickr-Faces-HQ (FFHQ), The Oxford Buildings and Places2. We demonstrate that our inpainting results can be used in other high-level computer vision tasks such as face verification and semantic segmentation. We believe that the inpainting results can also be used in place recognition.	en_US
dcterms.abstract	For future research, we aim to develop a more comprehensive recognition system in which our inpainting models are used as a pre-processing module to obtain better input images and our tubing strategy is applied at the post-processing stage to obtain better recognition performance. Apart from combining the techniques discussed in this thesis, we would like to develop an online learning strategy that keeps the understanding of a path up to date, further enhancing life-long recognition performance.	en_US
dcterms.extent	xiv, 141 pages : color illustrations	en_US
dcterms.isPartOf	PolyU Electronic Theses	en_US
dcterms.issued	2023	en_US
dcterms.educationalLevel	M.Phil.	en_US
dcterms.educationalLevel	All Master	en_US
dcterms.LCSH	Automated vehicles -- Data processing	en_US
dcterms.LCSH	Automated vehicles -- Control	en_US
dcterms.LCSH	Machine learning	en_US
dcterms.LCSH	Hong Kong Polytechnic University -- Dissertations	en_US
dcterms.accessRights	open access	en_US

Files in This Item:

File	Description	Size	Format
6700.pdf	For All Users	5.33 MB	Adobe PDF

Copyright Undertaking

As a bona fide Library user, I declare that:

I will abide by the rules and legal ordinances governing copyright regarding the use of the Database.
I will use the Database for the purpose of my research or private study only and not for circulation or further reproduction or any other purpose.
I agree to indemnify and hold the University harmless from and against any loss, damage, cost, liability or expenses arising from copyright infringement or unauthorized usage.

By downloading any item(s) listed above, you acknowledge that you have read and understood the copyright undertaking as stated above, and agree to be bound by all of its terms.

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12268