| Author: | Dong, Junnan |
| Title: | Knowledge-based question answering with large language models |
| Advisors: | Huang, Xiao (COMP); Li, Qing (COMP) |
| Degree: | Ph.D. |
| Year: | 2025 |
| Department: | Department of Computing |
| Pages: | xiv, 92 pages : color illustrations |
| Language: | English |
| Abstract: | Knowledge-based question answering (KBQA) has been extensively studied to answer questions that necessitate domain knowledge from external sources, e.g., knowledge graphs (KGs). With the emergence of large language models (LLMs), integrating KGs and LLMs into more advanced KBQA frameworks remains important but under-explored. In this thesis, I selectively present two effective methods that could benefit various domains with knowledgeable and efficient LLM-based KBQA. Previous efforts have been devoted to multi-hop reasoning over a KG with small models, namely KGMs, mainly focusing on the design of semantic parsing and path retrieval. My research begins with a novel hierarchy-aware framework, i.e., HamQA, which effectively exploits the semantic hyponymy and structural hierarchy in both questions and KGs for complex reasoning tasks. However, this line of methods often struggles with subjective and generative questions, whose answers cannot be retrieved from the graph. Moreover, it remains difficult for such methods to understand and answer complex questions, e.g., those involving negation and rhetoric. Since LLMs have shown remarkable capabilities in both comprehension and generation, they have brought new opportunities to traditional KBQA. However, LLMs are often criticized for their tendency to produce hallucinations, in which models fabricate incorrect statements about tasks beyond their knowledge and perception. Moreover, the cost of invoking LLMs is significantly higher: both calling the API of closed-source LLMs (token-level costs) and deploying open-source ones locally (resources) can be prohibitive. To this end, I systematically developed a series of algorithms to comprehensively combine KGs and LLMs, which can be categorized into two classes: LLM-guided KG reasoning and a cost-efficient combination. 
The first method, namely MAIL, leverages LLMs (and visual LLMs) to construct dual graphs, i.e., a scene graph and a concept graph, which describe the key objects and events in the image and question together with the corresponding background knowledge. It enables tight inter-modal fusion while maximally preserving the insightful intra-modal information of each modality. The second method, COKE, assembles two sets of base models, i.e., LLMs and KGMs, to balance inferential accuracy and cost saving. It automatically assigns the most promising model to each question and makes reliable decisions that account for both inferential accuracy and cost saving, balancing exploration and exploitation during selection. On this basis, we can both effectively and efficiently leverage the explicit and implicit knowledge in KGs and LLMs and incorporate it into various domain-specific reasoning tasks. With careful and advanced designs, the outcomes of my research have broad implications across various domains. By enabling domain experts to leverage the complementary strengths of both KGs and LLMs, this thesis paves the way for more knowledgeable and efficient domain-specific question answering frameworks in both industrial scenarios and the research community. |
| Rights: | All rights reserved |
| Access: | open access |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/13973

