Author: Sun, Shichao
Title: Reading, writing, and refining for text summarization
Advisors: Li, Wenjie Maggie (COMP)
Degree: Ph.D.
Year: 2024
Subject: Text processing (Computer science)
Automatic abstracting
Natural language processing (Computer science)
Machine learning
Hong Kong Polytechnic University -- Dissertations
Department: Department of Computing
Pages: xx, 188 pages : color illustrations
Language: English
Abstract: Text summarization is a critical task in natural language processing, in which a lengthy text is condensed into a concise version that highlights the key points of the original source. In today's fast-paced digital environment, text summarization has become a valuable tool: it offers a remedy for information overload by enabling quicker understanding and assimilation of knowledge. Recently, text summarization has received increasing attention from the academic community.
As pre-trained language models advance, they imitate the human summarization pipeline of reading, writing, and refining, and have achieved impressive performance. However, these developments also bring new challenges to every stage of this pipeline. Current language models struggle to understand the different kinds of documents handled by different summarization methods, such as news articles in unsupervised extractive summarization and meeting transcripts in supervised abstractive summarization. This limitation prevents existing summarization methods from producing concise and coherent summaries. Additionally, pre-trained language models are not originally crafted for the specialized purpose of text summarization. Supervised fine-tuning with summarization data suffers from exposure bias and cannot elicit the full capabilities of pre-trained language models, so there remains untapped potential for them to deliver even better performance. Lastly, much as with human writers, a summary generated on the first attempt usually contains flaws. It is therefore essential to devise novel approaches for refining these initial summaries into superior versions, yet this research area remains largely unexplored. In this thesis, we distill the aforementioned challenges into three research problems: (1) How can we improve semantic understanding of the input document? (2) How can we enhance the summary writing of pre-trained language models? (3) How can we reliably refine the generated summary for superior quality?
To tackle the challenges outlined above, we propose a range of approaches that enhance every stage of the text summarization pipeline. The body of research presented in this thesis is structured into three parts.
In the first part (works 1 and 2), we explore problem (1) for unsupervised and supervised summarization. Most recent work on unsupervised summarization of news articles needs to understand each sentence and estimate the similarity between pairs of sentences. However, sentence similarity estimated with pre-trained language models performs worse in unsupervised summarization than similarity computed from statistical vector representations such as TF-IDF. In work 1, we propose mutual learning with an extra signal amplifier, in which different granularities of pivotal information for summarization are learned from each other to enhance semantic understanding. On the other hand, lengthy and complex documents, such as meeting transcripts, pose severe challenges for supervised summarization. In work 2, we focus on meeting summarization, which condenses the key information in meeting transcripts into concise textual summaries. This task is particularly challenging because of the intricate nature of multi-party interactions and the potentially hundreds of participants' utterances within a single meeting. We explore and confirm the viability of employing dialogue acts, which represent the functions that utterances serve in dialogue-based interaction, to improve understanding of meeting transcripts. By integrating dialogue acts, we enhance the classical extract-abstract framework for meeting summarization.
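The contrast between TF-IDF-based and pre-trained-model-based sentence similarity mentioned above can be illustrated with the classical graph-based extractive baseline that work 1 builds on. The Python sketch below ranks sentences by centrality in a TF-IDF cosine-similarity graph (a TextRank-style ranker); it is only an illustration of that common baseline, not the mutual-learning method proposed in the thesis.

import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def extract_summary(sentences, num_sentences=3):
    """Rank sentences by centrality in a TF-IDF similarity graph."""
    # Represent each sentence with a statistical TF-IDF vector.
    tfidf = TfidfVectorizer().fit_transform(sentences)
    # Pairwise cosine similarity between sentences gives the edge weights.
    sim = cosine_similarity(tfidf)
    scores = nx.pagerank(nx.from_numpy_array(sim))
    # Keep the top-ranked sentences, restored to document order.
    top = sorted(scores, key=scores.get, reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]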
In the second part (works 3 and 4), we investigate problem (2) with contrastive learning. Supervised fine-tuning on summarization data is the de facto standard practice for text summarization, but it suffers from exposure bias, a discrepancy between model training and inference. In work 3, we alleviate this issue by introducing a novel contrastive learning loss that integrates an extra negative sample into the fine-tuning process, improving the model's resilience to exposure bias. Furthermore, we observe that while a limited amount of data can yield highly promising results, excessive data can degrade performance, potentially due to overfitting. In work 4, we therefore explore methods for data selection and data ordering, with the goal of preserving the efficiency and stability observed in work 3.
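To make the idea in work 3 concrete, the sketch below shows one common way to add a contrastive term to supervised fine-tuning: the reference summary is pushed to score above a model-generated negative sample by a margin. The scoring function, margin, and loss weighting here are illustrative assumptions, not the exact formulation used in the thesis.

import torch
import torch.nn.functional as F

def contrastive_fine_tuning_loss(pos_logprob, neg_logprob, nll_loss,
                                 margin=1.0, weight=0.5):
    """Combine the usual MLE loss with a margin-based contrastive term.

    pos_logprob: length-normalized log-probability of the reference summary
    neg_logprob: length-normalized log-probability of a negative sample
    nll_loss:    standard token-level cross-entropy (teacher-forcing) loss
    """
    # Penalize the model whenever the negative sample is scored too close
    # to, or above, the reference summary.
    contrastive = F.relu(margin - (pos_logprob - neg_logprob))
    return nll_loss + weight * contrastive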
In the third part (works 5 and 6), we address problem (3). Recently, a new paradigm, generating-critiquing-refining, has been introduced into text generation tasks such as summarization: we can prompt large language models to draft a summary, generate critiques of the draft, and then use the critiques to polish the draft further. In work 5, we study which of two prompting strategies, prompt chaining and stepwise prompting, better realizes this generating-critiquing-refining workflow for refinement in text summarization. In work 6, motivated by the intuition that a superior critique leads to better refinement, we first devise an evaluation method for critiques. We then validate the hypothesis that a superior critique directly contributes to producing a better summary. Beyond text summarization, we conduct extensive experiments on other text generation tasks, including entailment, reasoning, and question answering, to demonstrate that critiques can play a constructive role in polishing the outputs of a wide range of text generation tasks.
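The generating-critiquing-refining workflow compared in work 5 can be sketched as follows, with prompt chaining issuing one model call per stage and a stepwise prompt packing all three stages into a single call. The llm argument is a hypothetical text-in, text-out callable, and the prompts are illustrative only; they do not reproduce the prompts studied in the thesis.

def refine_by_prompt_chaining(llm, document):
    # Three separate calls: draft, critique, then refine using the critique.
    draft = llm(f"Summarize the following document:\n{document}")
    critique = llm(f"Document:\n{document}\n\nDraft summary:\n{draft}\n\n"
                   "List the factual errors and omissions in this draft.")
    refined = llm(f"Document:\n{document}\n\nDraft summary:\n{draft}\n\n"
                  f"Critique:\n{critique}\n\n"
                  "Rewrite the summary to address the critique.")
    return refined

def refine_by_stepwise_prompt(llm, document):
    # A single call that asks the model to perform all three stages in order.
    return llm(
        f"Document:\n{document}\n\n"
        "Step 1: Draft a summary.\n"
        "Step 2: Critique your draft.\n"
        "Step 3: Output only the refined summary."
    )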
To sum up, we conduct a comprehensive study of text summarization powered by pre-trained language models, ranging from understanding the input text and tuning the models to refining the generated summary. By applying our proposed methods to real-world datasets, we demonstrate significant improvements.
Rights: All rights reserved
Access: open access

Files in This Item:
File: 7644.pdf (For All Users), 4.98 MB, Adobe PDF




Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/13192