Author: | Wu, Mingli |
Title: | Investigations on temporal-oriented event-based extractive summarization |
Degree: | Ph.D. |
Year: | 2008 |
Subject: | Hong Kong Polytechnic University -- Dissertations. Automatic abstracting. Computational linguistics. |
Department: | Department of Computing |
Pages: | xiii, 151 leaves : ill. ; 30 cm. |
Language: | English |
Abstract: | Automatic summarization aims to produce a concise summary of source documents by identifying the focused topics in documents. Normally, topics are represented by some essential events. Topics may evolve or shift over time. Tracking the trend of the topics requires anchoring events on the time line. Unfortunately, neither events nor their associated time features have been well studied in previous work. Investigating event-based and temporal-oriented summarization techniques is the primary objective of this study. The salience of content can hardly be evaluated from a single point of view, so developing a framework that can effectively integrate multiple impact factors is another objective. We define events by "action" words as well as associated named entities. Events weave documents into a map built either on event instances or event concepts. Relevance between events is exploited to identify important events. To utilize temporal information associated with events, it is necessary to extract and normalize temporal expressions. We investigate rule-based approaches for these tasks. Two statistical measures are employed to evaluate the significance of events based on their temporal distributions. Sentence selection is a complicated process. Therefore we explore various features, including surface, content, event and relevance features, under a learning-based classification framework. Event-based and temporal-oriented approaches are incorporated as features into this framework. The contributions of this study are as follows. (1) Event-based summarization approaches are proposed. They achieve competitive results when compared with successful word-based approaches. (2) Temporal concepts are introduced into event-based summarization, and temporal information is found to be crucial to the summarization of documents that contain evolving topics. (3) An adaptive learning-based framework is developed to incorporate various types of features. (4) A system for temporal expression extraction and normalization is implemented. It is an effective tool, practical not only for document summarization but also for many other applications. |
Rights: | All rights reserved |
Access: | open access |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
b21900346.pdf | For All Users | 1.95 MB | Adobe PDF |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/1362