Neural abstractive summarization for long documents

Liu, Shuaiqi

Full metadata record

DC Field	Value	Language
dc.contributor	Department of Computing	en_US
dc.contributor.advisor	Cao, Jiannong (COMP)	en_US
dc.creator	Liu, Shuaiqi	-
dc.identifier.uri	https://theses.lib.polyu.edu.hk/handle/200/12810	-
dc.language	English	en_US
dc.publisher	Hong Kong Polytechnic University	en_US
dc.rights	All rights reserved	en_US
dc.title	Neural abstractive summarization for long documents	en_US
dcterms.abstract	Long documents, such as academic literature, financial reports, and legal instruments, are important information sources. People can now access a massive number of long documents through the Internet, but reading through all of them to find the desired content is a heavy burden. High-quality summaries can help people quickly grasp the key information in the original documents. Automatic text summarization techniques can be employed to produce concise summaries of long documents. Abstractive summarization methods approximate how humans write summaries by capturing the salient content of the input documents and generating novel sentences as summaries.	en_US
dcterms.abstract	In this thesis, I study neural abstractive summarization for long documents. I aim to train neural network models to generate informative, fluent, and non-redundant summaries covering the multi-granularity, multi-document, and multimodal salient content in various long documents. Accomplishing this objective raises several challenges: 1) the scarcity of available datasets, 2) identifying the multi-granularity salient information scattered in long inputs, 3) incorporating multi-document and multimodal content when generating summaries, 4) evaluating the quality of the generated summaries, and 5) improving the efficiency of model training and inference. To tackle these challenges, I built multiple large-scale datasets and developed novel summarization methods and evaluation metrics, which are summarized below.	en_US
dcterms.abstract	First, I built multiple large-scale long document summarization datasets for academic literature, financial reports, and legal instruments, which can serve as a foundation for long document summarization research. These datasets also support extending long document summarization research from unimodal to multimodal inputs, and from summarizing a limited number of documents to summarizing a large number of documents.	en_US
dcterms.abstract	Second, I propose a series of techniques to identify the multi-granularity salient information scattered in long documents. This thesis introduces novel attention mechanisms, a category-based content alignment method, and a multistage content selection schema for identifying and encoding phrase-level, sentence-level, and segment-level salient content.	en_US
dcterms.abstract	Third, my research validates the importance of jointly considering multimodal or multi-document content when summarizing long documents. This thesis proposes multiple methods that incorporate salient content from text and tables into summary generation. It also proposes methods that summarize multiple categories of salient content from a large number of documents and generate structured summaries.	en_US
dcterms.abstract	To evaluate various summarization methods, my research not only employs commonly used automatic evaluation metrics but also proposes novel evaluation metrics. I also compare the summaries generated by different models through human evaluation.	en_US
dcterms.abstract	Finally, my research leverages various techniques to improve the efficiency of model training and inference. This thesis not only proposes efficient summarization models but also adopts memory-efficient training methods. These techniques enable training large neural summarization models over long inputs on an off-the-shelf GPU.	en_US
dcterms.abstract	I hope this thesis can promote long document summarization research. Although this thesis presents novel datasets, methods, and evaluation metrics for this topic, many problems remain open. I list some future research directions at the end of this thesis.	en_US
dcterms.extent	xxi, 172 pages : color illustrations	en_US
dcterms.isPartOf	PolyU Electronic Theses	en_US
dcterms.issued	2024	en_US
dcterms.educationalLevel	Ph.D.	en_US
dcterms.educationalLevel	All Doctorate	en_US
dcterms.LCSH	Computational intelligence	en_US
dcterms.LCSH	Automatic abstracting	en_US
dcterms.LCSH	Text processing (Computer science)	en_US
dcterms.LCSH	Hong Kong Polytechnic University -- Dissertations	en_US
dcterms.accessRights	open access	en_US

Files in This Item:

File	Description	Size	Format
7260.pdf	For All Users	2.35 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12810