Exploring text summarization beyond news articles

Yuan, Ruifeng

Full metadata record

DC Field	Value	Language
dc.contributor	Department of Computing	en_US
dc.contributor.advisor	Li, Wenjie Maggie (COMP)	en_US
dc.creator	Yuan, Ruifeng	-
dc.identifier.uri	https://theses.lib.polyu.edu.hk/handle/200/13041	-
dc.language	English	en_US
dc.publisher	Hong Kong Polytechnic University	en_US
dc.rights	All rights reserved	en_US
dc.title	Exploring text summarization beyond news articles	en_US
dcterms.abstract	Text summarization has long been an important task in natural language processing. It aims to compress the source document(s) into a more concise version that covers the most important information. In recent years, with the development of neural language models, text summarization has made great progress. Throughout this progress, news summarization has undoubtedly been the most important research topic in the field. On the one hand, news summaries have inherent application scenarios in real life. On the other hand, a set of large-scale news summarization datasets has been proposed to meet the data requirements of neural models. Therefore, for a considerable period, researching general summarization models on news data has been the mainstream paradigm in text summarization. With the continuous advancement of text summarization models and techniques, researchers are no longer confined to this paradigm but are further exploring or rediscovering more diverse text summarization problems. These problems often have their own unique characteristics, which means that general approaches cannot be blindly applied. Meanwhile, they still share similarities with news summarization in many ways. In this thesis, we aim to investigate text summarization problems beyond the "news data + general model" mainstream paradigm. More specifically, we identify three research problems to be addressed: 1. How to use the natural features of news articles to improve the general summarization model on news summarization? 2. How to extend the current general summarization models to summarization tasks/domains where these models cannot be directly applied? 3. How to utilize the large-scale data in news summarization to assist summarization tasks/domains with insufficient data?	en_US
dcterms.abstract	To address the aforementioned problems, we aim to develop summarization approaches for specific domains or tasks. Based on the three proposed research questions, the thesis is naturally divided into three parts.	en_US
dcterms.abstract	In the first part (work 1 and work 2), to enhance news summarization using its unique characteristics, we propose to incorporate a typical kind of extra information into the summarization model: event-level information. The research target here is to investigate what role event-level information plays in both extractive and abstractive news summarization, and how to make good use of it. In work 1, we propose to extract event-level semantic units for better extractive news summarization. We also introduce a hierarchical structure, which incorporates multiple levels of granularity of the textual information into the model. In work 2, we explore an effective sentence fusion approach that can fuse extracted salient information into abstractive summary sentences. We propose to build an event graph from the input sentences to effectively capture and organize related events in a structured way and use the constructed event graph to guide the summarization. In addition to making use of attention over the content of sentences and graph nodes, we further develop a graph flow attention mechanism to control the fusion process via the graph structure, improving the faithfulness of the fused summaries. The experiments and further ablation studies on news datasets demonstrate the effectiveness of event-level information in news summarization.	en_US
dcterms.abstract	The second part (work 3) aims to explore text summarization problems to which general summarization models cannot be directly applied, the most representative of which is long-input summarization. Because general summarization models struggle with long inputs due to their high memory cost, they cannot be directly applied to documents with thousands of tokens. The main challenge is how to effectively extend mature summarization techniques and efficiently handle the difficulties brought by the long input. In work 3, we present a context-aware extract-generate framework (CAEG) for long-input text summarization. It focuses on preserving both local and global context information in an extract-generate framework at little cost. CAEG generates a set of context-related text spans called context prompts for each text snippet and uses them to transfer the context information from the extractor to the generator. To find such context prompts, we propose to capture the context information based on the interpretation of the extractor, where the text spans having the highest contribution to the extraction decision are considered to contain the richest context information. The experiments show the effectiveness and efficiency of our model in capturing and preserving the context information in long-input summarization.	en_US
dcterms.abstract	The third part (work 4 and work 5) delves into problem 3 and investigates how to effectively transfer the knowledge of summarization learned from news data to tasks or domains with insufficient data. Work 4 explores this problem from the perspective of task knowledge transfer in the context of query-focused summarization. In this work, we investigate whether we can integrate and transfer the knowledge of news summarization and question answering to assist few-shot learning in query-focused summarization. Here, we propose prefix-merging, a prefix-based pre-training strategy for few-shot learning in query-focused summarization. We integrate the task knowledge from text summarization and question answering into a properly designed prefix and apply the merged prefix to query-focused summarization. In addition to task knowledge transfer, we also investigate domain transfer of extractive summarization in work 5. In text summarization, context information is considered a key factor. Meanwhile, there also exist other pattern factors that can identify sentence importance, such as sentence position or certain n-gram tokens. In this work, we attempt to apply disentangled representation learning to extractive summarization and separate the two key factors for the task, context and pattern, for better generalization in the low-resource setting. The experiments suggest the great potential of the knowledge contained in large-scale news summarization data for improving summarization systems in other tasks or domains.	en_US
dcterms.abstract	In conclusion, we study our proposed research problems of text summarization in a systematic way. We illustrate the importance of these problems and demonstrate the effectiveness of our approaches on various datasets. This shows the potential of our work to benefit real-world applications, such as news summarization, academic research and medical records collection.	en_US
dcterms.extent	xx, 172 pages : color illustrations	en_US
dcterms.isPartOf	PolyU Electronic Theses	en_US
dcterms.issued	2024	en_US
dcterms.educationalLevel	Ph.D.	en_US
dcterms.educationalLevel	All Doctorate	en_US
dcterms.LCSH	Automatic abstracting	en_US
dcterms.LCSH	Electronic information resources -- Abstracting and indexing	en_US
dcterms.LCSH	Natural language processing (Computer science)	en_US
dcterms.LCSH	Hong Kong Polytechnic University -- Dissertations	en_US
dcterms.accessRights	open access	en_US

Files in This Item:

File	Description	Size	Format
7493.pdf	For All Users	3.84 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/13041