Author: | Lin, Zehang |
Title: | Learning robust multimodal representation for event detection from social media data |
Advisors: | Li, Qing (COMP) |
Degree: | Ph.D. |
Year: | 2025 |
Department: | Department of Computing |
Pages: | xvi, 142 pages : color illustrations |
Language: | English |
Abstract: | The widespread use of social media has generated vast amounts of data, presenting unique challenges and opportunities for information processing. This massive data, characterized by its scale and complexity, demands advanced analytics to realize its full potential. Within this scenario, social event detection emerges as a critical analytics task, aiming to identify and categorize significant events from the data streams available on social media platforms. However, the social media data used for event detection are multimodal, fragmented, cross-platform, and dynamic in nature. The performance of current social event detection methods is hindered by two major problems. The first is limited detection accuracy: despite the rapid advancement of deep learning methods, they face various challenges in handling the modality heterogeneity inherent in multimodal social media event data and the out-of-distribution (OOD) problem caused by information fragmentation. Existing methods, although beginning to leverage multimodal data for event detection, often struggle to identify the correct events when faced with fragmented information. The second is insufficient generalization capability: current supervised event detection methods generalize poorly to different data sources and newly emerging events, because they overlook the cross-platform and dynamic nature of social event data.
To address these problems, this thesis focuses on the following objectives. Firstly, we aim to design a deep learning model that addresses modality heterogeneity and the OOD problem, thereby improving the accuracy of event detection. Secondly, we aim to develop an innovative approach that adapts models for cross-platform social event detection. Thirdly, we aim to extend existing supervised event detection methods to discover new social events in social media.
To achieve the first objective, we introduce a Multimodal Fusion with External Knowledge (MFEK) model. This method incorporates a text enrichment module that leverages image semantics to enhance textual content, along with a knowledge-aware feature fusion mechanism that effectively integrates external knowledge with multimodal data to mitigate modality heterogeneity and the OOD problem caused by the fragmentation of social event data. We find that incorporating external knowledge brings a significant improvement in performance, even in scenarios with fragmented information.
To accomplish the second objective, we develop a Self-Supervised Modality Complementation (SSMC) method to enhance the model's adaptability and performance across different social media platforms. By introducing a Missing Data Complementation (MDC) module and a Multimodal Self-Learning (MSL) module, SSMC effectively addresses incomplete modalities and platform heterogeneity in cross-platform event detection. We find that this strategy ensures robust cross-platform event detection even in the presence of varied and incomplete data. In addition, we validate the role of cross-platform event detection in improving the quality of single-platform event data.
For the third objective, we propose a new task, generalized social event detection, which requires accurately identifying predefined events while detecting emerging new events. Specifically, we propose a Dynamic Augmentation and Entropy Optimization (DAEO) model, which utilizes adversarial learning to learn robust multimodal representations and introduces an adaptive entropy optimization technique with self-distillation that promotes adaptability to newly emerging events. We demonstrate that this combination enables effective identification of both known and new events, thereby enhancing the model's generalization capability.
To summarize, this thesis proposes an MFEK model that introduces external knowledge to improve the accuracy of social event detection, an SSMC method that enhances cross-platform adaptability, and a DAEO model that tackles generalized social event detection, thereby addressing key challenges in multimodal social event detection and improving overall model performance and generalization. Extensive experiments conducted on publicly available datasets and our collected real-world datasets demonstrate the significance of these methods for social event detection, outperforming state-of-the-art baseline approaches. |
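The knowledge-aware fusion idea behind MFEK can be pictured with a minimal PyTorch sketch. Everything here (module names, embedding dimensions, the cross-attention fusion scheme, and the toy classifier head) is an illustrative assumption, not the architecture described in the thesis.

```python
# Minimal sketch of knowledge-aware multimodal fusion, in the spirit of MFEK.
# Dimensions and design choices are assumptions for illustration only.
import torch
import torch.nn as nn

class KnowledgeAwareFusion(nn.Module):
    def __init__(self, dim=256, num_events=10):
        super().__init__()
        self.text_proj = nn.Linear(768, dim)    # e.g. a BERT sentence embedding
        self.image_proj = nn.Linear(2048, dim)  # e.g. a pooled CNN image feature
        self.know_proj = nn.Linear(300, dim)    # e.g. knowledge-graph entity embeddings
        # Cross-attention lets the post representation attend to external knowledge.
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(dim, num_events)

    def forward(self, text_emb, image_emb, know_emb):
        # Treat the projected text and image features as a two-token "post" sequence.
        post = torch.stack([self.text_proj(text_emb), self.image_proj(image_emb)], dim=1)
        know = self.know_proj(know_emb)          # (batch, n_entities, dim)
        fused, _ = self.attn(post, know, know)   # knowledge-conditioned post tokens
        pooled = (post + fused).mean(dim=1)      # residual connection + mean pooling
        return self.classifier(pooled)

model = KnowledgeAwareFusion()
logits = model(torch.randn(4, 768), torch.randn(4, 2048), torch.randn(4, 5, 300))
print(logits.shape)  # torch.Size([4, 10])
```

The sketch only shows how external knowledge could be injected through attention before classification; how MFEK enriches text with image semantics and selects knowledge entries is described in the thesis itself.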
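Similarly, the adaptive entropy optimization with self-distillation used by DAEO can be illustrated, in spirit only, by a single adaptation step on unlabeled data. The loss weighting, the EMA teacher, and all hyperparameters below are assumptions for illustration, not the thesis's formulation.

```python
# Illustrative sketch: entropy minimization plus self-distillation on unlabeled data.
# Not the DAEO model; the teacher/EMA scheme and weights are assumed for the example.
import copy
import torch
import torch.nn.functional as F

def adapt_step(student, teacher, optimizer, x_unlabeled, alpha=0.5, ema=0.99):
    p = F.softmax(student(x_unlabeled), dim=-1)
    # Entropy term: encourage confident predictions on unlabeled target data.
    entropy = -(p * torch.log(p + 1e-8)).sum(dim=-1).mean()
    # Self-distillation term: keep the student close to a slowly updated teacher.
    with torch.no_grad():
        teacher_p = F.softmax(teacher(x_unlabeled), dim=-1)
    distill = F.kl_div(torch.log(p + 1e-8), teacher_p, reduction="batchmean")
    loss = entropy + alpha * distill
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Exponential-moving-average update of the teacher weights.
    with torch.no_grad():
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(ema).add_(s_param, alpha=1 - ema)
    return loss.item()

student = torch.nn.Linear(256, 10)          # stand-in for a multimodal event classifier
teacher = copy.deepcopy(student)
opt = torch.optim.SGD(student.parameters(), lr=1e-3)
adapt_step(student, teacher, opt, torch.randn(8, 256))
```

The point of the sketch is the combination of a confidence-promoting entropy term with a distillation term that stabilizes adaptation, which is the general mechanism the abstract attributes to DAEO.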
Rights: | All rights reserved |
Access: | open access |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/13656