Author: | Li, Yanran |
Title: | Modeling contextual information for chit-chat conversation |
Advisors: | Li, Wenjie Maggie (COMP) |
Degree: | Ph.D. |
Year: | 2021 |
Subject: | Human-computer interaction; Artificial intelligence; Natural language processing (Computer science); Hong Kong Polytechnic University -- Dissertations |
Department: | Department of Computing |
Pages: | xviii, 224 pages : illustrations |
Language: | English |
Abstract: | The emergence of mobile devices and messaging applications has revolutionized the way information propagates among individuals and has created demand for virtual conversational agents that assist and accompany human users. This presents unprecedented challenges and opportunities, and it has driven many researchers to study how to respond properly to a user given a conversation context. In this thesis, we incorporate extra information such as knowledge, emotion and intention into open-domain chatbots, with the aim of improving the informativeness and coherence of the generated responses. Specifically, we identify three main research problems in open-domain conversation response generation: 1. How can extra information benefit conversation context modeling and response generation when building open-domain chatbots? 2. How can a chatbot effectively learn how information changes across conversation turns and account for the dependencies among them? 3. How can conversation context be better understood by capturing the interactions between extra information and conversation utterances, so as to improve conversation coherence holistically? To address these problems, we develop several approaches based on Seq2Seq models, inspired by recent advances in neural response generation. Because conversations inherently have discourse structure, we divide the thesis into three parts, each concentrating on one level of conversation structure. In the first part (works 1 and 2), we investigate research problem 1 at the most basic level of conversation, the utterance level. To improve utterance-level coherence and alleviate the data sparsity issue, we develop two conditional conversation models that consider knowledge and emotion information, respectively. 
In work 1, we focus on knowledge incorporation and use conversation-related knowledge to generate entity-aware responses. On two movie conversation corpora, the proposed knowledge-grounded chatbot significantly outperforms four other knowledge-grounded models. In work 2, we shift our attention to emotion incorporation and present a conditional variational model for controlled response generation. The main idea is to introduce an external label to supervise variable learning when response generation is conditioned on one or more specific attributes. In addition, we propose keeping a separate dialogue context for each of the two speakers in the conversation, in order to learn speaker-aware information such as personality, sentiment and style. The experimental results demonstrate that our framework can generate responses conditioned on specific attributes, which contributes to utterance-level coherence. The second part (works 3 and 4) explores solutions to problem 2 with the aim of improving conversation-level coherence. Conversation is unique in that information changes as the conversation proceeds, and the information at the current state depends on both the current utterance and the previous information states. In this part, we therefore explore the dynamics of information to improve conversation-level coherence for social chatbots. Specifically, in work 3, we leverage meta-path information and propose a meta-path-augmented chatbot that first compares the context vector with each of the learned meta-path vectors and then selects the candidate entities that comply with the most similar meta-path. In work 4, we identify social coherence and individual coherence as two intention factors in conversation modeling, and design two strategies to incorporate them into multi-round response generation. 
On two real-world multi-round conversation datasets, we demonstrate the effectiveness of the proposed approach in improving conversation-level coherence. The third part (works 5 and 6) delves into problem 3 and investigates how to improve context-level coherence. Despite recent improvements, the majority of existing methods learn the conversation representation and the information representation separately, which prevents chatbots from modeling the conversation context accurately and in turn degrades response quality. In this last part of our work, we argue that conversation utterances and other information should be regarded together as a single conversation context, and we propose structured models that capture the potential interactions within that context. In work 5, we unify conversation utterances and background knowledge in one graph and design a graph encoder that learns finer and deeper features for better response generation. In work 6, we jointly consider the emotion and intention states of the speakers and propose an adversarial-augmented hierarchical model to generate responses that are sensitive to speaker states. Through extensive experiments, we verify the hypothesis that human emotions have a prior effect on conversational behavior. In summary, we study open-domain conversation modeling and response coherence in a systematic way. We demonstrate the effectiveness of the proposed approaches on real-world datasets, which suggests their potential in real-world scenarios such as empathetic companions for the elderly and entertaining social chatbots. |
Rights: | All rights reserved |
Access: | open access |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/11355