Full metadata record
DC Field: Value (Language)
dc.contributor: Department of Computing (en_US)
dc.contributor.advisor: Cao, Jiannong (COMP) (en_US)
dc.creator: Wang, Jia
dc.identifier.uri: https://theses.lib.polyu.edu.hk/handle/200/11095
dc.language: English (en_US)
dc.publisher: Hong Kong Polytechnic University (en_US)
dc.rights: All rights reserved (en_US)
dc.title: Data-driven deep reinforcement learning for decision-making applications (en_US)
dcterms.abstract: Decision-making applications have become an important part of today's competitive, knowledge-based society and benefit many important areas. Significant progress has been made in machine learning recently, thanks to the availability of data that was previously inaccessible. By learning from past data, machine learning can make better decisions than those based solely on domain knowledge. Among the various machine learning algorithms, reinforcement learning (RL) is the most promising because it learns to map current conditions to decision solutions and considers the impact of current decisions on subsequent ones. Typically, an RL agent learns through trial and error, using data harvested from its own experience to make informed decisions. In many practical applications, a large amount of offline-collected data with rich prior information already exists, so the ability to learn from big data becomes the key for reinforcement learning to solve realistic decision-making problems. Unlike traditional RL methods, which interact with an online environment, learning a strategy from a fixed dataset is particularly challenging, for three reasons. First, data generated from daily system operations are not independent and identically distributed; by training on a partial dataset, an RL agent can converge to a model that makes it reluctant to explore the remaining data and further improve its performance. Second, without a proper understanding of the underlying data distribution, an RL agent may learn a decision-making strategy that overfits the observed samples in the training set but fails to generalize to unseen samples in the testing set. Third, RL training can be very unstable when the data are noisy and highly variable. (en_US)
dcterms.abstract: In this thesis, we study data-driven reinforcement learning, aiming to derive decision strategies from big data collected offline. The first contribution of this thesis is enabling an RL agent to learn strategies from data with repetitive patterns: to force the agent to fully "explore" massive data, we partition the historical big dataset into multiple batches and study, in both theory and practice, how the agent can incrementally improve its strategy by learning from these multi-batch datasets. The second contribution is that we explore the underlying data distribution under the reinforcement learning scheme; with the learned generative distribution, one can select the hardest (most representative) samples to train the strategy model, thus achieving better application performance. The third contribution is that we apply the RL method to learn strategies from high-variance data; specifically, we bound the parameter distribution of the new strategy to stay close to that of its predecessor strategy to stabilize training (a minimal sketch of the multi-batch and proximal-update ideas follows this record). Finally, through data-driven reinforcement learning, we thoroughly study various applications, including social analysis, dynamic resource allocation, and multi-agent pattern formation. (en_US)
dcterms.extent: xvi, 100 pages : color illustrations (en_US)
dcterms.isPartOf: PolyU Electronic Theses (en_US)
dcterms.issued: 2021 (en_US)
dcterms.educationalLevel: Ph.D. (en_US)
dcterms.educationalLevel: All Doctorate (en_US)
dcterms.LCSH: Reinforcement learning (en_US)
dcterms.LCSH: Machine learning (en_US)
dcterms.LCSH: Big data (en_US)
dcterms.LCSH: Hong Kong Polytechnic University -- Dissertations (en_US)
dcterms.accessRights: open access (en_US)
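
To make the abstract's first and third contributions concrete, here is a minimal sketch, assuming a PyTorch discrete-action policy: the offline dataset is partitioned into multiple batches, and each update keeps the new strategy's action distribution close to its predecessor's via a KL penalty. All names (PolicyNet, proximal_update, multi_batch_training), the network shape, and the kl_coef weight are illustrative assumptions, not the thesis's actual algorithms.

import copy
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Small discrete-action policy (illustrative architecture)."""
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, n_actions),
        )

    def forward(self, states):
        # Return an action distribution over the discrete action set.
        return torch.distributions.Categorical(logits=self.net(states))

def proximal_update(policy, old_policy, optimizer, states, actions,
                    advantages, kl_coef=1.0):
    """One gradient step: policy-gradient loss plus a KL penalty that
    bounds the new strategy near its predecessor (assumed formulation)."""
    dist = policy(states)
    with torch.no_grad():
        old_dist = old_policy(states)  # frozen predecessor strategy
    pg_loss = -(dist.log_prob(actions) * advantages).mean()
    kl = torch.distributions.kl_divergence(old_dist, dist).mean()
    loss = pg_loss + kl_coef * kl  # penalize drift from the predecessor
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), kl.item()

def multi_batch_training(policy, dataset, n_batches, epochs_per_batch=10):
    """Partition an offline dataset and improve the strategy batch by
    batch, snapshotting the predecessor before each partition."""
    optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)
    states, actions, advantages = dataset  # pre-collected offline tensors
    for chunk in zip(states.chunk(n_batches), actions.chunk(n_batches),
                     advantages.chunk(n_batches)):
        old_policy = copy.deepcopy(policy)  # anchor to the predecessor
        for _ in range(epochs_per_batch):
            proximal_update(policy, old_policy, optimizer, *chunk)
    return policy

For example, multi_batch_training(PolicyNet(8, 4), (s, a, adv), n_batches=5) would sweep five partitions of a logged dataset with 8-dimensional states and 4 actions, re-anchoring the KL penalty to the latest strategy at each partition.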

Files in This Item:
File: 5553.pdf
Description: For All Users
Size: 11.53 MB
Format: Adobe PDF



