| Author: | Yu, Xiaotong |
| Title: | Intelligent perception systems for mobile robot : from semantic-aware planning to hybrid quantum-classical view optimisation |
| Advisors: | Chen, Chang Wen (COMP) |
| Degree: | Ph.D. |
| Year: | 2025 |
| Department: | Department of Computing |
| Pages: | xvi, 147 pages : color illustrations |
| Language: | English |
| Abstract: | Intelligent perception represents a critical challenge for mobile robotic systems operating in complex, unknown environments. Current approaches face fundamental limitations in semantic understanding, in optimisation quality for viewpoint selection, and in robust sensing under variable conditions. This thesis investigates these challenges and proposes novel solutions to enhance the perceptual capabilities of autonomous mobile robots. The research is motivated by three key observations: first, traditional Next-Best-View planning algorithms typically optimise for geometric coverage without considering semantic significance, resulting in inefficient exploration when specific objects hold particular importance; second, classical optimisation methods for viewpoint selection often converge to suboptimal solutions owing to the vast, high-dimensional solution space; and third, the predominant reliance on RGB imagery limits robustness in challenging lighting conditions and raises privacy concerns in sensitive applications. To address these challenges, this thesis presents three complementary contributions. The first introduces a semantic-aware Next-Best-View (S-NBV) framework that incorporates semantic information alongside visibility metrics in a unified information-gain formulation, enabling efficient search-and-acquisition manoeuvres. Experimental validation demonstrates up to a 27.46% improvement in region-of-interest reconstruction efficiency over state-of-the-art methods. The second contribution develops a Hybrid Quantum-Classical Next-Best-View (HQC-NBV) framework that leverages quantum computing principles to navigate the complex solution space of viewpoint selection more effectively. Using a novel Hamiltonian formulation and bidirectional entanglement patterns, this approach achieves up to 49.2% higher exploration efficiency than classical methods, establishing a pioneering connection between quantum computing and robotic perception.
The third contribution presents the Cross Shallow and Deep Perception Network (CS-DNet), a lightweight architecture designed to integrate low-coherence depth and thermal modalities. Through spatial information prescreening, implicit coherence navigation, and Segment Anything Model (SAM)-assisted encoder pre-training, CS-DNet achieves performance comparable to triple-modality (RGB-D-T) methods while using only depth and thermal data and reducing computational requirements by orders of magnitude. This demonstrates that effective integration of low-coherence modalities can deliver robust perception in challenging conditions without relying on RGB data, offering both efficiency and inherent privacy advantages. Extensive experiments across diverse scenarios validate the effectiveness of these approaches. The research advances the capabilities of robotic perception systems, enabling more intelligent exploration through semantic awareness, superior viewpoint selection through hybrid quantum-classical optimisation, and robust operation in challenging environmental conditions through effective multi-modal integration. |
| Rights: | All rights reserved |
| Access: | open access |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/14073