Navigating the unseen : out-of-distribution generalization and detection in open environments

Zhang, Yabin

Full metadata record

DC Field	Value	Language
dc.contributor	Department of Computing	en_US
dc.contributor.advisor	Zhang, Lei (COMP)	en_US
dc.creator	Zhang, Yabin	-
dc.identifier.uri	https://theses.lib.polyu.edu.hk/handle/200/13423	-
dc.language	English	en_US
dc.publisher	Hong Kong Polytechnic University	en_US
dc.rights	All rights reserved	en_US
dc.title	Navigating the unseen : out-of-distribution generalization and detection in open environments	en_US
dcterms.abstract	In open environments, artificial intelligence (AI) models face two main types of out-of-distribution (OOD) samples that deviate from their training data (i.e., in-distribution data): covariate-shifted OOD samples, which are consistent in semantics but exhibit covariate shifts, and semantic-shifted OOD samples, which have different semantic labels. Such OOD samples can severely challenge the safety and reliability of AI systems by inducing high-confidence errors. Comprising four studies, this thesis aims to enhance generalization to covariate shifts through methods such as style augmentation and memory networks, and to improve detection of semantic-shifted samples using strategies such as prompt tuning and adaptive negative proxies. These efforts are crucial for the reliable performance of AI models in open environments.	en_US
dcterms.abstract	In Chapter 1, we introduce the concepts of covariate-shifted and semantic-shifted OOD samples and review existing methods and challenges associated with OOD generalization and detection. We detail the objectives, contributions, and structure of the thesis. In Chapter 2, we introduce Exact Feature Distribution Matching (EFDM), a novel technique that advances style augmentation by integrating higher-order statistics for enhanced generalization to covariate-shifted OOD samples. EFDM employs empirical Cumulative Distribution Functions and a Sort-Matching technique, demonstrating superior performance over traditional methods in extensive experiments. In Chapter 3, we develop dual memory networks to extend the generalization capabilities of vision-language models (VLMs) such as CLIP. This strategy significantly improves performance on both in-distribution and covariate-shifted OOD samples, validated through rigorous testing across a variety of datasets. Chapters 4 and 5 then focus on detecting semantic-shifted OOD samples. In Chapter 4, we introduce Label-driven Automated Prompt Tuning (LAPT) to address the limitations of manual prompt engineering in VLM-based OOD detection. Using distribution-aware prompts and automatically collected negative training data, LAPT reduces manual effort and improves detection performance across various tasks. In Chapter 5, we focus on constructing adaptive negative proxies from test images in a test-time adaptation manner. This approach facilitates online mining of negative test samples, enhancing the model’s ability to distinguish between in-distribution and OOD instances, as demonstrated on standard benchmarks.	en_US
dcterms.abstract	In summary, this thesis contributes to the field of OOD generalization and detection by introducing innovative methods that enhance performance and reduce manual intervention. By addressing specific challenges associated with covariate and semantic shifts in OOD samples, these studies significantly improve the reliability and safety of AI systems in open environments.	en_US
dcterms.extent	xix, 157 pages : color illustrations	en_US
dcterms.isPartOf	PolyU Electronic Theses	en_US
dcterms.issued	2024	en_US
dcterms.educationalLevel	Ph.D.	en_US
dcterms.educationalLevel	All Doctorate	en_US
dcterms.LCSH	Artificial intelligence	en_US
dcterms.LCSH	Machine learning	en_US
dcterms.LCSH	Hong Kong Polytechnic University -- Dissertations	en_US
dcterms.accessRights	open access	en_US
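
The Sort-Matching step mentioned in the Chapter 2 abstract can be illustrated with a minimal sketch. It assumes two equal-length 1-D arrays and uses hypothetical names (`sort_match`, `content`, `style`) that do not come from the thesis; how the thesis applies the idea to network features is not described in this record. Matching by sorted rank makes the output a reordering of the `style` values, so its empirical CDF is identical to that of `style` and every statistic matches, not only the mean and variance.

```python
import numpy as np


def sort_match(content: np.ndarray, style: np.ndarray) -> np.ndarray:
    """Return `style` values rearranged to follow the rank order of `content`.

    Illustrative assumption: both inputs are 1-D arrays of equal length.
    The k-th smallest element of `content` is replaced by the k-th
    smallest element of `style`.
    """
    if content.ndim != 1 or content.shape != style.shape:
        raise ValueError("expected two 1-D arrays of the same length")
    # Rank of each content element (0 = smallest); a double argsort yields ranks.
    ranks = np.argsort(np.argsort(content, kind="stable"), kind="stable")
    # Look up the style value that holds the same rank.
    return np.sort(style)[ranks]


content = np.array([0.3, -1.0, 2.5, 0.0])
style = np.array([10.0, 40.0, 20.0, 30.0])
print(sort_match(content, style))  # -> [30. 10. 40. 20.]
```

The output keeps the ordering of `content` (its largest element stays the largest) while taking on exactly the values of `style`.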

Files in This Item:

File	Description	Size	Format
7844.pdf	For All Users	14.72 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/13423