Author: Gao, Huicai
Title: Forecasting tourism demand using big data
Advisors: Li, Hengyun (SHTM)
Song, Haiyan (SHTM)
Degree: Ph.D.
Year: 2024
Subject: Tourism -- Forecasting
Tourism -- Data processing
Tourism -- Management
Hong Kong Polytechnic University -- Dissertations
Department: School of Hotel and Tourism Management
Pages: 133 pages : color illustrations
Language: English
Abstract: Accurate tourism demand forecasting is vital for policymakers and businesses seeking to detect tourism-related changes. It is also crucial for industry stakeholders to undertake rapid measures to guarantee recovery, sustainable growth, and solid commercial opportunities over the long term. With the ongoing development of Internet applications and technology, large volumes of Internet data reflecting tourists' behaviors are being recorded, accumulated, and produced. Various Internet big data have thus been employed to forecast tourism demand, such as social media data, search intensity indices (SII) data, and web traffic data. Because these data are an important supplement to traditional data sources, scholarly research on tourism forecasting with Internet big data has proliferated over the past several decades and shows no signs of abating.
This thesis explores how to utilize different Internet big data for high-efficiency tourism demand prediction. The general process of a model prediction begins with comprehensive variable input, followed by valuable variable extraction, and finally by an effective forecasting method. Three Internet big data–based studies are carried out progressively in this thesis, each contributing to different stages of model prediction. Study 1 focuses on the variable input stage by mining fine-grained sentiment variables from social media data for tourism demand prediction. Subsequently, Study 2 focuses on the variable extraction stage by introducing a novel inter- and intra-feature relation–based multi-feature fusion method for tourism demand prediction. Ultimately, Study 3 focuses on the forecasting method stage by introducing a novel time- and feature-varying ensemble learning–based combination forecasting model for tourism demand prediction. Tourist arrivals predictions for tourist attractions (i.e., Jiuzhaigou Valley and Kulangsu Island, China) and destinations (i.e., Hong Kong and Sanya City, China) are carried out. Diverse time series models, machine learning models, and deep learning models are applied for tourism demand forecasting in this thesis.
Study 1 utilizes fine-grained sentiment analysis to analyze tourists' preferences from social media data for tourism demand prediction. Generic sentiment calculations of tourist online reviews cannot fully reflect tourists' preferences, whereas aspect-based sentiment analysis (ABSA) identifies tourists' precise preferences. On the basis of fine-grained aspect-level sentiment analysis and hybrid feature engineering (FE), this study forecasts tourist arrivals at Kulangsu Island and Jiuzhaigou Valley in China using Internet data from different sources (i.e., SII data, official announcement data, and online review data). Empirical results show that 1) fine-grained sentiment analysis of online review data can substantially improve tourism demand models' forecasting performance; 2) combining multidimensional sentiment analysis–based online review data with SII data outperforms SII data in tourism demand prediction; and 3) fine-grained sentiment analysis–based online review data and SII data maintain stable predictive power during times of uncertainty.
Study 2 introduces a novel inter- and intra-feature relation–based multi-feature fusion model. Although various FE approaches have been utilized to extract valuable features for predicting tourism demand, the interactive inter- and intra-feature relations from Internet big data, especially from multisource Internet big data, have been largely ignored by forecasting models. On the basis of SII data and structured online review data (e.g., ratings), Study 2 forecasts the tourist arrivals at Kulangsu Island and Jiuzhaigou Valley in China. Empirical results show that the proposed deep network–based FE method surpasses most typical FE methods in tourism demand forecasting. Meanwhile, incorporating interactive inter- and intra-feature relations from multisource Internet big data into the forecasting model can remarkably enhance forecasting accuracy.
Study 3 introduces an innovative approach that employs a time- and feature-varying ensemble learning–based meta-learner to consolidate individual model forecasts. Choosing appropriate weights for individual models represents a major challenge in combination forecasting. Most research has used constant or equal weights, ignoring dynamic weights that take into account data and temporal patterns. The proposed model integrates statistical, machine learning, and deep learning models, along with SII data, to forecast tourist arrivals at Hong Kong and Sanya City, China. Results show that the proposed model surpasses most individual models and typical combination methods in stable and uncertain times. The findings also highlight the proposed model's ability to yield consistent and reliable predictions across a variety of scenarios, particularly during volatile periods.
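The contrast between equal weights and data-dependent weights in combination forecasting can be illustrated with a minimal Python sketch. This is not the time- and feature-varying ensemble meta-learner proposed in Study 3; it swaps in a much simpler rolling inverse-error weighting scheme purely for illustration. All function names, the window length, and the numbers are hypothetical and do not come from the thesis.

```python
def equal_weight_combine(forecasts):
    """Combine one period's model forecasts with constant, equal weights."""
    return sum(forecasts) / len(forecasts)


def rolling_inverse_error_weights(past_forecasts, past_actuals, window=3, eps=1e-9):
    """Weight each model by the inverse of its recent mean absolute error.

    past_forecasts: list of periods, each a list with one forecast per model.
    past_actuals:   observed values for the same periods.
    The weights change as the window slides, so they vary over time.
    """
    recent_f = past_forecasts[-window:]
    recent_a = past_actuals[-window:]
    n_models = len(recent_f[0])
    mae = [
        sum(abs(f[m] - a) for f, a in zip(recent_f, recent_a)) / len(recent_a)
        for m in range(n_models)
    ]
    inverse = [1.0 / (e + eps) for e in mae]  # eps avoids division by zero
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_combine(forecasts, weights):
    """Combine one period's model forecasts with the given weights."""
    return sum(w * f for w, f in zip(weights, forecasts))


# Made-up arrivals data: model A has been off by 10, model B by 30.
past_forecasts = [[110, 130], [210, 230], [310, 330]]
past_actuals = [100, 200, 300]
next_forecasts = [400, 440]

weights = rolling_inverse_error_weights(past_forecasts, past_actuals)
print("equal weights:  ", equal_weight_combine(next_forecasts))
print("dynamic weights:", round(weighted_combine(next_forecasts, weights), 2))
```

In this toy setting the recently more accurate model receives the larger weight, so the combined forecast leans toward it, whereas the equal-weight average treats both models the same regardless of their recent performance.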
This thesis contributes to the tourism demand forecasting literature and methodologies in a few ways. It enhances the variable input of tourism demand forecasting literature by demonstrating the capacity of review-level fine-grained sentiment variables to predict tourism demand more precisely. It enhances the variable extraction of tourism demand forecasting literature by demonstrating the capacity of inter- and intra-feature relations from multisource Internet big data to predict tourism demand more accurately. It enhances the forecasting methods of combination tourism demand forecasting literature by demonstrating the capacity of dynamic weights to predict tourism demand more precisely. Additionally, it makes an initial attempt to examine the resilience of SII data, social media data, and multisource Internet big data for tourism demand prediction during turbulent periods. Practically, using the methods proposed in this work to generate precise tourism demand predictions will provide tourism stakeholders with a valuable aid for effective resource planning and efficient operation management. Meanwhile, by integrating SII data, ABSA–based social media data, or multisource Internet big data into forecasting models, the predictions will also enable tourism managers and policymakers to make informed judgments in unforeseen circumstances such as COVID-19.
Rights: All rights reserved
Access: open access

Files in This Item:
File: 7785.pdf
Description: For All Users
Size: 8.68 MB
Format: Adobe PDF




Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/13391