Full metadata record
DC Field | Value | Language
dc.contributor | Department of Computing | en_US
dc.contributor.advisor | Cao, Jiannong (COMP) | en_US
dc.creator | Liu, Shuaiqi | -
dc.identifier.uri | https://theses.lib.polyu.edu.hk/handle/200/12810 | -
dc.language | English | en_US
dc.publisher | Hong Kong Polytechnic University | en_US
dc.rights | All rights reserved | en_US
dc.title | Neural abstractive summarization for long documents | en_US
dcterms.abstract | Long documents, such as academic literature, financial reports, and legal instruments, are important information sources. People can now access a massive number of long documents through the Internet, but reading through all of them to find the desired content is a heavy burden. High-quality summaries help people quickly grasp the key information in the original documents. Automatic text summarization techniques can be employed to produce concise summaries of long documents. Abstractive summarization methods approximate how humans write summaries by capturing the input documents’ salient content and generating novel sentences as summaries. | en_US
dcterms.abstract | In this thesis, I study neural abstractive summarization for long documents. I aim to train neural network models to generate informative, fluent, and non-redundant summaries covering the multi-granularity, multi-document, and multimodal salient content in various long documents. Accomplishing this objective raises several challenges: 1) the scarcity of available datasets, 2) identifying the multi-granularity salient information scattered across long inputs, 3) incorporating multi-document and multimodal content when generating summaries, 4) evaluating the quality of the generated summaries, and 5) improving the efficiency of model training and inference. To tackle these challenges, I built multiple large-scale datasets and developed novel summarization methods and evaluation metrics, which are summarized below. | en_US
dcterms.abstract | First, I built multiple large-scale long document summarization datasets for academic literature, financial reports, and legal instruments, which can serve as a foundation for long document summarization research. My datasets also support extending this research from unimodal to multimodal inputs, and from summarizing a limited number of documents to summarizing a large number of documents. | en_US
dcterms.abstract | Second, I propose a series of techniques to identify the multi-granularity salient information scattered in long documents. This thesis introduces novel attention mechanisms, a category-based content alignment method, and a multistage content selection schema for identifying and encoding phrase-level, sentence-level, and segment-level salient content. | en_US
dcterms.abstract | In addition, my research validates the importance of jointly considering multimodal or multi-document content when summarizing long documents. This thesis proposes multiple methods that incorporate salient content from text and tables into summary generation. It also proposes methods to summarize multiple categories of salient content from a large number of documents and generate structured summaries. | en_US
dcterms.abstract | To evaluate the summarization methods, my research employs commonly used automatic evaluation metrics and also proposes novel ones. I also compare the summaries generated by different models through human evaluation. | en_US
dcterms.abstract | Last but not least, my research leverages various techniques to improve the efficiency of model training and inference. This thesis not only proposes efficient summarization models but also adopts memory-efficient training methods. These techniques enable training large neural summarization models over long inputs on an off-the-shelf GPU. | en_US
dcterms.abstract | I hope this thesis can promote long document summarization research. Although this thesis presents novel datasets, methods, and evaluation metrics, the topic still has many open problems. I list some future research directions at the end of this thesis. | en_US
dcterms.extent | xxi, 172 pages : color illustrations | en_US
dcterms.isPartOf | PolyU Electronic Theses | en_US
dcterms.issued | 2024 | en_US
dcterms.educationalLevel | Ph.D. | en_US
dcterms.educationalLevel | All Doctorate | en_US
dcterms.LCSH | Computational intelligence | en_US
dcterms.LCSH | Automatic abstracting | en_US
dcterms.LCSH | Text processing (Computer science) | en_US
dcterms.LCSH | Hong Kong Polytechnic University -- Dissertations | en_US
dcterms.accessRights | open access | en_US

Files in This Item:
File | Description | Size | Format
7260.pdf | For All Users | 2.35 MB | Adobe PDF


Copyright Undertaking

As a bona fide Library user, I declare that:

  1. I will abide by the rules and legal ordinances governing copyright regarding the use of the Database.
  2. I will use the Database for the purpose of my research or private study only and not for circulation or further reproduction or any other purpose.
  3. I agree to indemnify and hold the University harmless from and against any loss, damage, cost, liability or expenses arising from copyright infringement or unauthorized usage.

By downloading any item(s) listed above, you acknowledge that you have read and understood the copyright undertaking as stated above, and agree to be bound by all of its terms.


Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12810