Author: Li, Yongqi
Title: Towards interactive information seeking : conversational question answering
Advisors: Li, Wenjie Maggie (COMP)
Degree: Ph.D.
Year: 2024
Subject: Question-answering systems
Intelligent agents (Computer software)
Natural language generation (Computer science)
Hong Kong Polytechnic University -- Dissertations
Department: Department of Computing
Pages: xvi, 185 pages : color illustrations
Language: English
Abstract: In recent years, the rise of machine learning techniques has accelerated the development of conversational agents. These conversational agents provide a natural and convenient way for people to chit-chat, complete well-specified tasks, and seek information in their daily lives. People often prefer to ask conversational questions when they have complex information needs or are interested in certain broad topics. Although current personal assistant systems are capable of completing tasks and even conducting small talk, they cannot handle information-seeking conversations with complicated information needs that require multiple turns of interaction. It is, therefore, essential to endow conversational agents with the capability of answering conversational questions, which introduces a broad and new research area, namely conversational question answering (CoQA).
Despite its many advantages, CoQA has proven to be significantly more challenging than traditional QA. Specifically, we identify three main research problems to be addressed in conversational QA: (1) How to explore the dynamic interaction of information content and communication intent for thorough conversation context modeling? (2) How to accurately identify users' information needs within conversations and retrieve relevant documents from the web accordingly? (3) How to enrich textual conversational QA with multimodal evidence, such as images and tables, and effectively model the complex relations among multimodal items?
To address the aforementioned problems, we focus on developing effective conversational QA systems from three aspects: multiple conversation flow tracking, generative document retrieval, and multimodal knowledge enhancement. Accordingly, the research in this thesis is organized into three parts.
In the first part (works 1 and 2), we investigate research problem 1, the core problem of conversational QA. We innovatively model the intricate interactions within conversations as multiple conversation flows to enhance question answering. In work 1, we leverage the conversation context to aid in locating answers for the current turn. Because conversational questions are coherent by nature, the corresponding answers tend to be related and organized within logically connected passages on the web. We term this coherence among answers the answer flow and utilize it to improve current-turn QA. In work 2, we take the concept of flow in conversational question answering further, introducing three distinct conversation flows: question flow, topic flow, and answer flow. These flows are utilized to enhance the current-turn QA process by considering multi-level information transitions.
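The multi-flow idea above can be illustrated with a minimal sketch. All names and structure here are hypothetical and purely illustrative; the thesis's actual flow models are learned neural components, not a simple history tracker like this.

```python
# Hypothetical sketch: tracking three conversation flows (question, topic,
# answer) across turns and composing a model input for the current turn.
# This is NOT the thesis's implementation, only an illustration of the idea
# of exposing multi-level information transitions to a reader model.

class FlowTracker:
    def __init__(self):
        self.question_flow = []   # previous questions
        self.topic_flow = []      # coarse topic label per turn
        self.answer_flow = []     # previous answers

    def update(self, question, topic, answer):
        self.question_flow.append(question)
        self.topic_flow.append(topic)
        self.answer_flow.append(answer)

    def compose_input(self, current_question):
        # Interleave the three flows so a downstream model can observe
        # question, topic, and answer transitions, then append the
        # current-turn question.
        history = " ".join(
            f"[Q] {q} [T] {t} [A] {a}"
            for q, t, a in zip(
                self.question_flow, self.topic_flow, self.answer_flow
            )
        )
        return f"{history} [CUR] {current_question}".strip()

tracker = FlowTracker()
tracker.update("Who wrote Hamlet?", "literature", "William Shakespeare")
print(tracker.compose_input("When was he born?"))
```

The separator tokens (`[Q]`, `[T]`, `[A]`, `[CUR]`) are arbitrary markers chosen for this sketch; a real system would use learned representations of each flow rather than flat string concatenation.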
The second part (works 3, 4, and 5) explores solutions for problem 2, aiming at more effective passage retrieval in conversational QA. In this part, we explore a new retrieval paradigm, generative retrieval, and demonstrate its effectiveness in the conversational setting. In work 3, we propose a multiview-identifier-enhanced generative retrieval framework, which unifies previous generative retrieval methods in one framework and achieves state-of-the-art performance. In work 4, we further enhance this framework by bridging it with the classical learning-to-rank paradigm. In work 5, we apply the proposed generative retrieval method to the conversational QA task and demonstrate the superiority of generative retrieval in that setting.
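The core mechanic of generative retrieval, generating a document identifier token by token while constraining decoding to identifiers that actually exist in the corpus, can be sketched as follows. The trie construction, the greedy decoder, and the toy scoring function are all assumptions for illustration; a real system would use a trained seq2seq model's next-token distribution in place of `score_fn`.

```python
# Hypothetical sketch of generative retrieval: the model "retrieves" a
# passage by generating its identifier, with decoding constrained by a
# prefix trie so only valid corpus identifiers can be produced.

def build_trie(identifiers):
    """Build a prefix trie over whitespace-tokenized identifiers."""
    trie = {}
    for ident in identifiers:
        node = trie
        for tok in ident.split():
            node = node.setdefault(tok, {})
        node["<eos>"] = {}  # mark a complete identifier
    return trie

def constrained_greedy_decode(score_fn, trie):
    """Greedily pick the best-scoring token among those the trie allows."""
    node, output = trie, []
    while True:
        allowed = list(node.keys())
        tok = max(allowed, key=lambda t: score_fn(output, t))
        if tok == "<eos>":
            return " ".join(output)
        output.append(tok)
        node = node[tok]

# Toy corpus of identifiers (e.g. one "view" could be passage titles).
ids = ["hamlet plot summary", "hamlet authorship", "macbeth plot summary"]
trie = build_trie(ids)

# Stand-in scorer: rewards tokens that appear in the (toy) query, plus a
# small bonus standing in for learned relevance. A trained model's logits
# would replace this.
query = "who wrote hamlet"
score = lambda prefix, tok: (tok in query.split()) + 0.5 * (tok == "authorship")

print(constrained_greedy_decode(score, trie))  # prints "hamlet authorship"
```

The trie guarantees the decoder can only emit an identifier present in the corpus, which is what makes generation usable as retrieval; beam search over the same constraint would yield a ranked list instead of a single identifier.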
The third part (work 6) delves into problem 3 and investigates multimodal conversational QA. Previous conversational QA systems usually rely on one specific knowledge source, overlooking visual evidence and other multimodal knowledge sources. In work 6, we hence define a novel research task, multimodal conversational question answering (MMCoQA), which aims to answer users' questions from multimodal knowledge sources via multi-turn conversations. This new task brings a series of research challenges, including but not limited to the priority, consistency, and complementarity of multimodal knowledge. To facilitate data-driven approaches in this area, we construct the first multimodal conversational QA dataset, named MMConvQA, and introduce a multimodal conversational QA model.
In summary, we study the problem of conversational QA in a systematic way and demonstrate the effectiveness of the proposed approaches on real-world datasets, suggesting their potential in real-world applications.
Rights: All rights reserved
Access: open access

Files in This Item:
File: 7339.pdf (For All Users, 8.21 MB, Adobe PDF)


Copyright Undertaking

As a bona fide Library user, I declare that:

  1. I will abide by the rules and legal ordinances governing copyright regarding the use of the Database.
  2. I will use the Database for the purpose of my research or private study only and not for circulation or further reproduction or any other purpose.
  3. I agree to indemnify and hold the University harmless from and against any loss, damage, cost, liability or expenses arising from copyright infringement or unauthorized usage.

By downloading any item(s) listed above, you acknowledge that you have read and understood the copyright undertaking as stated above, and agree to be bound by all of its terms.


Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12892