Author: Li, Zhaojin
Title: Bridging 3D reconstruction and semantic segmentation for planetary surface characterization
Advisors: Wu, Bo (LSGI)
Degree: Ph.D.
Year: 2025
Subject: Planets -- Surfaces
Three-dimensional imaging
Optical data processing
Image segmentation
Geomorphology
Hong Kong Polytechnic University -- Dissertations
Department: Department of Land Surveying and Geo-Informatics
Pages: xvii, 188 pages : color illustrations
Language: English
Abstract: The study of planetary surfaces holds great significance, as it ensures the safety of various in-situ space exploration missions and uncovers the evolutionary history of planets. Over the past several decades, a wealth of data has been collected, revealing the surfaces of celestial bodies such as the Moon, Mars, Mercury, and some asteroids. Among the various types of products, topographic maps and landform datasets derived from optical images have garnered significant attention. These products combine geometric and semantic information, providing a more complete description of surface characteristics.
For decades, three-dimensional (3D) reconstruction of planetary surfaces has been a focus of extensive research, utilizing techniques such as laser altimetry, photogrammetry, and photoclinometry to produce numerous topographic products across various planets. It is only in the past decade that semantic segmentation of planetary surfaces has gained significant attention, propelled by the rise of deep learning. This advancement has facilitated more robust results and decreased the dependence on human labor. Despite thorough research in both 3D reconstruction and semantic segmentation, challenges persist due to the poorly textured nature of planetary surfaces. While both 3D reconstruction and semantic segmentation originate from the same data source, they explore the underlying information in distinct ways.
Hence, this research aims to develop innovative approaches that bridge 3D reconstruction and semantic segmentation to achieve enhanced performance in planetary surface characterization, described through both geometric and semantic perspectives. Starting with high-resolution 3D reconstruction through the fusion of laser altimetry, photogrammetry, and photoclinometry, the resulting topographic models are employed to improve semantic segmentation via enhanced training data and geometric supervision. Conversely, the derived semantic cues are fed back to refine the 3D reconstruction, tackling complex features and challenging dense-matching scenarios.
In the first approach, a rigorous and pixel-wise 3D reconstruction is performed by integrating laser altimetry data, grayscale images, and radiance data. During the photogrammetric processing, we propose an exterior-orientation-parameter-guided feature matching algorithm and an object-based dense matching strategy to address the challenges of feature correspondence caused by the poorly textured nature of planetary surfaces. The resulting photogrammetric digital elevation model (DEM) is further refined using a photoclinometry process, enabling the generation of a topographic product with pixel-wise resolution and enhanced geometric detail. A comprehensive evaluation based on various satellite images verifies the generalizability and effectiveness of the proposed algorithm. The overall geometric difference is within 10% relative to publicly available DEM references, and qualitative assessments indicate the retrieval of pixel-wise details.
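To illustrate how known exterior orientation parameters (EOPs) can constrain feature correspondence on poorly textured surfaces, the sketch below predicts a feature's location in a second image by intersecting its viewing ray with a mean-elevation plane and then restricting the match search to a small window around the prediction. The toy pinhole geometry, parameter names, and plane assumption are illustrative, not the thesis's actual algorithm.

```python
import numpy as np

def eop_guided_window(pt, K1, R1, t1, K2, R2, t2, mean_elev=0.0, radius=15.0):
    """Predict where a feature seen at pixel `pt` in image 1 should appear
    in image 2, given both images' exterior orientation. The pixel ray is
    intersected with a horizontal plane at `mean_elev`, the intersection is
    projected into image 2, and candidate matches are searched only within
    `radius` pixels of that prediction. Toy pinhole model x = K [R | t] X;
    a simplification of EOP-guided matching, not the exact implementation."""
    ray = R1.T @ np.linalg.inv(K1) @ np.array([pt[0], pt[1], 1.0])
    cam1 = -R1.T @ t1                       # camera-1 centre in world frame
    s = (mean_elev - cam1[2]) / ray[2]      # ray parameter at the plane
    ground = cam1 + s * ray                 # approximate ground point
    p2 = K2 @ (R2 @ ground + t2)            # project into image 2
    return p2[:2] / p2[2], radius
```

Constraining the search window this way both prunes outliers and makes matching tractable where texture alone gives weak, ambiguous descriptors.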
Building upon the first approach, semantic segmentation is enhanced by incorporating 3D information. A semi-automatic dataset construction method is proposed that leverages both the 3D mesh models and the recovered camera parameters from the 3D reconstruction stage. Given a sufficient set of manually labeled images, this approach simultaneously generates textured RGB images, semantically labeled images, depth images, and XYZ images. To augment the manually labeled segments for training-dataset construction, a depth-enhanced transformer-based network is designed to generate additional semantic segments. Furthermore, a Siamese transformer-based network is proposed to extract transform-invariant multi-level semantic features, introducing tie-points as constraints to supervise semantic class consistency. Approximately 400 images are manually annotated and then augmented to around 25,000 images to construct the training dataset. The segmentation network achieves an overall accuracy of 88%, validating the effectiveness of the proposed method. Additionally, class consistency between overlapping images reaches 97.1%, compared with 91.2% for the original Swin Transformer, demonstrating the value of the Siamese architecture and tie-point supervision.
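The tie-point consistency supervision can be pictured with a simple loss: sample each branch's predicted class distribution at corresponding tie-point pixels of two overlapping images, and penalize their divergence. The function below is a minimal numpy stand-in (a symmetric KL divergence on softmax outputs); the actual network architecture, sampling scheme, and loss weighting in the thesis are not reproduced here.

```python
import numpy as np

def tiepoint_consistency_loss(logits_a, logits_b, ties_a, ties_b):
    """Symmetric KL divergence between softmax class distributions sampled
    at corresponding tie-points of two overlapping images.
    logits_*: (H, W, C) per-pixel class logits; ties_*: (N, 2) tie-point
    pixel coordinates as (col, row). Illustrative stand-in for tie-point
    supervision, not the thesis's exact loss."""
    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    pa = softmax(logits_a[ties_a[:, 1], ties_a[:, 0]])   # (N, C)
    pb = softmax(logits_b[ties_b[:, 1], ties_b[:, 0]])
    eps = 1e-8
    kl_ab = (pa * np.log((pa + eps) / (pb + eps))).sum(-1)
    kl_ba = (pb * np.log((pb + eps) / (pa + eps))).sum(-1)
    return float(0.5 * (kl_ab + kl_ba).mean())
```

The loss is zero when the two branches agree at every tie-point and grows as their class predictions drift apart, which is what drives the cross-image consistency reported above.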
In the third approach, the retrieved semantic cues are further utilized for enhanced 3D reconstruction. During the feature matching phase, these semantic cues are integrated to enhance the construction of feature descriptors, contextual aggregation, and outlier removal, resulting in robust cross-station matches. These matches facilitate accurate bundle adjustment, effectively linking more challenging images. In the dense matching stage, a frequency-domain similarity measurement is proposed and combined with semantic cues to enhance matching reliability and preserve surface discontinuities. Finally, the disparity maps generated from the matching results are used to derive 3D point clouds, which are then meshed to create 3D surface models. Experiments are conducted on two image datasets of typical Martian scenes collected by the Zhurong rover to evaluate the performance of the proposed method. The results indicate that image residuals of around 1.5 pixels on average are achieved for the bundle adjustment of cross-station images using the matched feature points, and the final generated 3D models exhibit an accuracy better than 0.5 m. Compared with cutting-edge commercial software, the generated 3D models from our method exhibit superior quality in terms of both accuracy and coverage, highlighting the effectiveness of the semantic-aware image matching algorithm.
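A standard example of a frequency-domain similarity measure is phase correlation, sketched below: the normalized cross-power spectrum of two patches yields, after an inverse FFT, a sharp peak whose height indicates similarity and whose location gives the relative translation. This is a textbook illustration of the general idea, not the specific measurement proposed in the thesis.

```python
import numpy as np

def phase_correlation(patch_a, patch_b):
    """Frequency-domain similarity of two equal-sized patches.
    Returns (peak value, peak location): the peak value approaches 1.0
    when patch_b is a pure translation of patch_a, and the peak location
    gives that translational offset (circularly)."""
    Fa = np.fft.fft2(patch_a)
    Fb = np.fft.fft2(patch_b)
    cross = np.conj(Fa) * Fb
    cross /= np.abs(cross) + 1e-12          # discard magnitude, keep phase
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    return corr.max(), peak
```

Because the magnitude is normalized away, the measure is insensitive to patch-wise brightness differences, a useful property on surfaces with weak texture and varying illumination.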
In summary, data from laser altimetry, visual cameras, and radiance measurements, together with frequency-domain information, are integrated to advance current 3D reconstruction and semantic segmentation techniques for planetary surfaces, revealing both geometric and contextual information. This research holds the potential to significantly enhance the characterization of planetary surfaces, improve the surface operations of exploration missions, and deepen our understanding of the relevant geomorphological and geological implications.
Rights: All rights reserved
Access: open access

Files in This Item:
8633.pdf (For All Users), 10.56 MB, Adobe PDF



Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/14177