Full metadata record
DC Field: Value (Language)
dc.contributor: Department of Computing (en_US)
dc.contributor.advisor: Cao, Jiannong (COMP) (en_US)
dc.creator: Wang, Jia
dc.identifier.uri: https://theses.lib.polyu.edu.hk/handle/200/11095
dc.language: English (en_US)
dc.publisher: Hong Kong Polytechnic University (en_US)
dc.rights: All rights reserved (en_US)
dc.title: Data-driven deep reinforcement learning for decision-making applications (en_US)
dcterms.abstract: Decision-making applications have become an important part of today's competitive, knowledge-based society and benefit many important areas. Significant progress has been made in machine learning recently, thanks to the availability of data that was previously inaccessible. By learning from past data, machine learning can make better decisions than those based solely on domain knowledge. Among the various machine learning algorithms, reinforcement learning (RL) is the most promising because it learns to map current conditions to decision solutions and considers the impact of current decisions on subsequent ones. Typically, an RL agent learns through trial and error, using data harvested from its own experience to make informed decisions. In many practical applications, a large amount of offline-collected data with rich prior information already exists, so the ability to learn from big data becomes the key for reinforcement learning to solve realistic decision-making problems. Unlike traditional RL methods, which interact with an online environment, learning a strategy from a fixed dataset is particularly challenging, for three reasons. First, data generated from daily system operations are not independent and identically distributed; by training on a partial dataset, an RL agent can converge to a model that makes it reluctant to explore the remaining data and further improve its performance. Second, without a proper understanding of the underlying data distribution, an RL agent may learn a decision-making strategy that overfits the observed samples in the training set but fails to generalize to unseen samples in the testing set. Third, RL training can be very unstable when the data are noisy and highly variable. (en_US)
dcterms.abstract: In this thesis, we study data-driven reinforcement learning, aiming to derive decision strategies from big data collected offline. The first contribution of this thesis is enabling an RL agent to learn strategies from data with repetitive patterns: to force the agent to fully "explore" massive data, we partition the historical big dataset into multiple batches and study, in both theory and practice, how the agent can incrementally improve its strategy by learning from these multi-batch datasets. The second contribution is that we explore the underlying data distribution under the reinforcement learning scheme; with the learned generative distribution, one can select the hardest (most representative) samples to train the strategy model, thus achieving better application performance. The third contribution is that we apply the RL method to learn strategies from high-variance data; specifically, we bound the parameter distribution of the new strategy to stay close to that of its predecessor strategy to stabilize training (a minimal sketch of the multi-batch and proximal-update ideas follows this record). Finally, through data-driven reinforcement learning, we thoroughly study various applications, including social analysis, dynamic resource allocation, and multi-agent pattern formation. (en_US)
dcterms.extent: xvi, 100 pages : color illustrations (en_US)
dcterms.isPartOf: PolyU Electronic Theses (en_US)
dcterms.issued: 2021 (en_US)
dcterms.educationalLevel: Ph.D. (en_US)
dcterms.educationalLevel: All Doctorate (en_US)
dcterms.LCSH: Reinforcement learning (en_US)
dcterms.LCSH: Machine learning (en_US)
dcterms.LCSH: Big data (en_US)
dcterms.LCSH: Hong Kong Polytechnic University -- Dissertations (en_US)
dcterms.accessRights: open access (en_US)
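
To make the abstract's first and third contributions concrete, here is a minimal sketch, assuming a PyTorch discrete-action policy: the offline dataset is partitioned into multiple batches, and each update keeps the new strategy's action distribution close to its predecessor's via a KL penalty. All names (PolicyNet, proximal_update, multi_batch_training), the network shape, and the kl_coef weight are illustrative assumptions, not the thesis's actual algorithms.

import copy
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Small discrete-action policy (illustrative architecture)."""
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, n_actions),
        )

    def forward(self, states):
        # Return an action distribution over the discrete action set.
        return torch.distributions.Categorical(logits=self.net(states))

def proximal_update(policy, old_policy, optimizer, states, actions,
                    advantages, kl_coef=1.0):
    """One gradient step: policy-gradient loss plus a KL penalty that
    bounds the new strategy near its predecessor (assumed formulation)."""
    dist = policy(states)
    with torch.no_grad():
        old_dist = old_policy(states)  # frozen predecessor strategy
    pg_loss = -(dist.log_prob(actions) * advantages).mean()
    kl = torch.distributions.kl_divergence(old_dist, dist).mean()
    loss = pg_loss + kl_coef * kl  # penalize drift from the predecessor
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), kl.item()

def multi_batch_training(policy, dataset, n_batches, epochs_per_batch=10):
    """Partition an offline dataset and improve the strategy batch by
    batch, snapshotting the predecessor before each partition."""
    optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)
    states, actions, advantages = dataset  # pre-collected offline tensors
    for chunk in zip(states.chunk(n_batches), actions.chunk(n_batches),
                     advantages.chunk(n_batches)):
        old_policy = copy.deepcopy(policy)  # anchor to the predecessor
        for _ in range(epochs_per_batch):
            proximal_update(policy, old_policy, optimizer, *chunk)
    return policy

For example, multi_batch_training(PolicyNet(8, 4), (s, a, adv), n_batches=5) would sweep five partitions of a logged dataset with 8-dimensional states and 4 actions, re-anchoring the KL penalty to the latest strategy at each partition.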

Files in This Item:
File: 5553.pdf
Description: For All Users
Size: 11.53 MB
Format: Adobe PDF



