Author: Li, Shuai
Title: Exploring efficient feature extractor and label assigner for object detection
Advisors: Zhang, Lei (COMP)
Degree: Ph.D.
Year: 2023
Subject: Image processing -- Digital techniques
Image analysis -- Data processing
Computer vision
Hong Kong Polytechnic University -- Dissertations
Department: Department of Computing
Pages: xviii, 120 pages : color illustrations
Language: English
Abstract: With the rapid development of deep learning techniques, various types of object detectors have been continuously developed to push the boundaries of detection performance. Modern detectors commonly encounter two core issues that significantly affect the final performance: feature extraction and label assignment. Therefore, in this thesis, we aim to design an efficient feature extractor and label assigner for generic object detection.
In Chapter 1, we provide an overview of the pipeline of widely-used object detectors and discuss the contributions and organization of this thesis. In Chapter 2, we focus on improving the anchor feature extraction process in one-stage detectors. Anchor features are fundamental training units in object detection and they are extracted from the image feature produced by the backbone. We introduce two efficient modules to enhance this process. The first is a bi-directional feature fusion module that combines both low-level detail information and high-level semantic information to enrich the image feature. The second is the dynamic anchor feature selection module, which aligns the receptive field of anchor features with the anchor shape. The anchor features extracted in this way are precise and robust, effectively easing the training burden of the detector. In Chapter 3, we introduce a dual weighting (DW) label assignment scheme for NMS-based one-stage detectors. The primary goal of label assignment is to assign a positive or negative label to each anchor to facilitate the training process. To provide finer supervision signals, we propose a method that assigns each anchor both a soft positive label and a soft negative label, which is achieved through two carefully designed assigners. DW is fully compatible with the detection evaluation metric and can significantly enhance the detector’s performance without introducing any additional parameters.
In Chapter 4, we introduce a one-to-few label assignment (LA) method for end-to-end (NMS-free) dense detection. Our approach combines the advantages of one-to-one LA and one-to-many LA by gradually reducing the number of positive training samples from ‘many’ to ‘one’ during the training process. By doing so, the detector can learn a robust feature representation and prevent the occurrence of duplicated predictions. In Chapter 5, we delve into the LA problem in the realm of unsupervised domain adaptation (UDA) for object detection. To tackle this issue, we put forward a novel approach called the Category Dictionary Guided (CDG) UDA model, which aims to generate more reliable pseudo labels. In essence, our approach involves learning several category dictionaries from the source domain, and then utilizing them to represent the samples in the target domain. The residual of the representation is used as a metric to select high-quality pseudo labels.
To summarize, this thesis presents four approaches that aim at enhancing the feature extraction and label assignment processes in object detection: a bi-directional and dynamic anchor feature extractor, a dual weighting label assigner for NMS-based detectors, a one-to-few label assigner for end-to-end dense detection, and a category dictionary guided label assigner for cross-domain detection. The efficiency and effectiveness of these methods have been demonstrated through extensive experiments on several benchmarks.
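Illustrative sketch (not taken from the thesis): the Chapter 4 idea of shrinking the number of positive samples from 'many' to 'one' during training could look roughly like the following. The linear schedule, the top-k selection rule, and the names `num_positives`, `assign_labels`, and `k_start` are all assumptions made for illustration; the thesis may use a different schedule or soft weights rather than hard labels.

```python
def num_positives(step, total_steps, k_start=8):
    """Hypothetical linear schedule: positive budget shrinks from k_start to 1."""
    if total_steps <= 1:
        return 1
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return max(1, round(k_start - frac * (k_start - 1)))


def assign_labels(scores, step, total_steps, k_start=8):
    """Label the top-k anchors (by matching score) as positive for one object.

    `scores` holds one assumed matching score per candidate anchor.
    Returns a list of 0/1 labels in the same order as `scores`.
    """
    k = min(num_positives(step, total_steps, k_start), len(scores))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    positives = set(ranked[:k])
    return [1 if i in positives else 0 for i in range(len(scores))]


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.5, 0.7]
    print(assign_labels(scores, step=0, total_steps=10, k_start=3))
    print(assign_labels(scores, step=9, total_steps=10, k_start=3))
```

Under these assumptions, training starts in a one-to-many regime (several positives per object) and ends in a one-to-one regime (a single positive), which is the transition the abstract credits with avoiding duplicated predictions.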
Rights: All rights reserved
Access: open access

File: 7140.pdf (For All Users; 14.68 MB; Adobe PDF)




Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12707