Author: | Liu, Zhewei |
Title: | Semantic, spatial and temporal modelling of geotagged social media data for desirable region and event detection |
Advisors: | Shi, Wenzhong (LSGI); Shen, Q. P. Geoffery (BRE) |
Degree: | Ph.D. |
Year: | 2021 |
Subject: | Geographic information systems; Big data; Data mining; Hong Kong Polytechnic University -- Dissertations |
Department: | Department of Land Surveying and Geo-Informatics |
Pages: | xix, 128 pages : color illustrations |
Language: | English |
Abstract: | The rise of Geotagged Social Media Data (GSMD) has provided new data sources and tools for investigating traditional research issues. Semantic, spatial and temporal information can be attached to GSMD, allowing it to reveal human mobility patterns and urban structure. However, previous methods and models show limited effectiveness in analysing GSMD because of the complexity of its characteristics. Given the above, the research objective of this thesis is to develop novel and effective methods and models for handling GSMD from three progressive perspectives: (1) semantic modelling, (2) spatial semantic modelling and (3) spatiotemporal semantic modelling. From each perspective, new models and data-handling methods are developed to tackle specific research problems, and their performance is evaluated accordingly. For semantic modelling, a new hashtag network model is developed for topic modelling and shows good performance on short social media texts. Statistical methods are traditionally used for topic modelling and geographical topic discovery. Nevertheless, statistical methods commonly require prior knowledge of the number of topics and large amounts of well-organized documents for training, requirements that are at odds with the social media environment, where prior knowledge is always lacking and short, noisy texts predominate. Consequently, a new data-driven topic modelling method is proposed, in which the hashtags attached to GSMD are used to construct a network model that is then divided into semantic communities. For spatial semantic modelling, a new scale-concerned model and a new data-driven model are proposed for predicting regional desirability. The proposed scale-concerned model extends the traditional Hypertext Induced Topic Search (HITS) model by taking the size of the region into account, and predicts regional desirability more accurately than previous methods.
Further, a new data-driven model, RegNet, is proposed to predict regional desirability using an adaptive encoding-prediction neural network structure. For spatiotemporal modelling, a new model is developed for event detection by finding spatiotemporal irregularities. The intuition is that a social event may cause irregular geographical patterns, especially irregular human mobility and interaction patterns. The proposed model therefore constructs both global and local features/indicators to characterize the spatial patterns of GSMD. Social events are then detected by finding irregularities in these features. Experiments are conducted with real-world datasets, and the results demonstrate that the proposed models are effective and outperform previous baseline methods. In sum, this thesis serves as a systematic study of GSMD modelling from several perspectives. In particular, it focuses on developing new models and data-handling methods that combine the semantic, spatial and temporal information attached to GSMD for the tasks of topic modelling, desirable region detection and event detection. The work presented in this thesis can benefit related urban studies by providing effective and robust data-handling models and methods, and can potentially be implemented as data-processing tools for tackling practical real-world problems. |
Rights: | All rights reserved |
Access: | open access |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/11565