New insights into meta-learning : from theory to algorithms

Chen, Jiaxin

Full metadata record

DC Field	Value	Language
dc.contributor	Department of Computing	en_US
dc.contributor.advisor	Chung, Fu-lai (COMP)	en_US
dc.creator	Chen, Jiaxin	-
dc.identifier.uri	https://theses.lib.polyu.edu.hk/handle/200/11450	-
dc.language	English	en_US
dc.publisher	Hong Kong Polytechnic University	en_US
dc.rights	All rights reserved	en_US
dc.title	New insights into meta-learning : from theory to algorithms	en_US
dcterms.abstract	Deep learning has surpassed human-level performance in various domains where the training data is abundant. Human intelligence, however, has the ability to adapt to a new environment with little experiences or recognize new categories after seeing just a few training samples. To endow machines such humanlike few-shot learning skills, meta-learning provides promising solutions and has generated a surge of interest recently. A meta-learning algorithm (meta-algorithm) trains over a large number of i.i.d. tasks sampled from a task distribution and learns an algorithm (inner-task algorithm) that can quickly adapt to a future task with few training data. From a generalization view, traditional supervised learning studies the generalization of a learned hypothesis (predictor) to novel data samples in a given task, whereas meta-learning explores the generalization of a learned algorithm to novel tasks. In this thesis, we provide new insights into the generalization of modern meta-learning based on theoretical analysis and propose new algorithms to improve its generalization ability. We provide theoretical investigations into Support/Query (S/Q) Episodic Training Strategy which is widely believed to improve the generalization and applied in modern meta-learning algorithms. We analyze the generalization error bound of generic meta-learning algorithms trained with such strategy via a stability analysis. We show that the S/Q episodic training strategy naturally leads to a counterintuitive generalization bound of O(1/√n), which only depends on the number of task n but independent of the inner-task sample size m. Under the common assumption m << n for few-shot learning, the bound of O(1/√n) implies strong generalization guarantee for modern meta-learning algorithms in the few-shot regime. We further point out that there still exist limitations of the existing modern meta-training strategy, i.e., the optimization procedure of model parameters following the strategy of differentiating through the inner-task optimization path. To satisfy this requirement, the inner-task algorithms should be solved analytically and this significantly limits the capacity and performance of the learned inner-task algorithms. Hence, we propose an adaptation-agnostic meta-training strategy that removes such dependency and can be used to train inner-task algorithms with or without analytical expressions. Such general meta-training strategy naturally leads to an ensemble framework that can efficiently combine various types of algorithms to achieve better generalization. Motivated by a closer look at metric-based meta-algorithms which shows high generalization ability over the few-shot classification problems, we propose a generic variational metric scaling framework which is compatible with metric-based meta-algorithms and achieves consistent improvements over the standard few-shot classification benchmarks. Finally, as majority of existing meta-algorithms focus on within-domain generalization, we further consider cross-domain generalization over a realistic yet more challenging few-shot classification problem, where a large discrepancy exists between the task distributions of training domains and a test domain. A gradient-based hierarchical meta-learning framework (M2L) is proposed to solve such problem. Finally, we discuss open questions and future directions in meta-learning.	en_US
dcterms.extent	ix, 86 pages : color illustrations	en_US
dcterms.isPartOf	PolyU Electronic Theses	en_US
dcterms.issued	2021	en_US
dcterms.educationalLevel	Ph.D.	en_US
dcterms.educationalLevel	All Doctorate	en_US
dcterms.LCSH	Machine learning	en_US
dcterms.LCSH	Data mining	en_US
dcterms.LCSH	Algorithms	en_US
dcterms.LCSH	Hong Kong Polytechnic University -- Dissertations	en_US
dcterms.accessRights	open access	en_US

Files in This Item:

File	Description	Size	Format
5966.pdf	For All Users	3.74 MB	Adobe PDF	View/Open

Copyright Undertaking

As a bona fide Library user, I declare that:

I will abide by the rules and legal ordinances governing copyright regarding the use of the Database.
I will use the Database for the purpose of my research or private study only and not for circulation or further reproduction or any other purpose.
I agree to indemnify and hold the University harmless from and against any loss, damage, cost, liability or expenses arising from copyright infringement or unauthorized usage.

By downloading any item(s) listed above, you acknowledge that you have read and understood the copyright undertaking as stated above, and agree to be bound by all of its terms.

Show simple item record

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/11450