Full metadata record
DC Field | Value | Language
dc.contributor | Department of Applied Mathematics | en_US
dc.contributor.advisor | Yiu, K. F. C. (AMA) | en_US
dc.creator | Tang, Wai Man | -
dc.identifier.uri | https://theses.lib.polyu.edu.hk/handle/200/13123 | -
dc.language | English | en_US
dc.publisher | Hong Kong Polytechnic University | en_US
dc.rights | All rights reserved | en_US
dc.title | Prediction system in big data analytics | en_US
dcterms.abstract | Forecasting and causality analysis are essential to decision making and resource management, as they relate outcomes to exogenous factors or events. In addition, investment return prediction is crucial for proper risk control and management. Applications built on advanced technologies are now part of daily life, and big data can be collected more easily and at lower cost. Knowledge can be extracted to indicate important changes in a time series, provided that the exogenous factors or events are fit for the purpose, since they can be instantaneous or aggregated over a certain duration. Prediction and causality are key functions in data analysis, where models can be used to extract useful features and predict data trends. Feature selection and extraction are crucial methodologies in data analysis, in which sequential data are transformed into suitable features for further analysis. Relevant factors or features, which embed the essential information needed to explain the dependent variable, should be selected. This is critical to ensuring useful models and accurate results. | en_US
dcterms.abstract | In this thesis, our work focuses on two key types of methods: conjoining spatio-temporal data for analysis by neural networks with deep learning, and novel factor subset selection in the time-frequency representation. Applications in various areas are studied. Chapter 2 investigates traffic speed data for multi-timestep forecasting. Congestion speed-cycle patterns of the target road segment are correlated with those of nearby road segments. An appropriate input subset can be selected for neural network training with deep learning when input data dimensions are minimal. Chapter 3 investigates the short-time Fourier transform (STFT), where consistent patterns are used to identify factor subsets. A multi-factor model with factors in different timeframes should be more useful and practical for forecasting future movements in a dynamic environment. Finally, Chapter 4 investigates wavelet transforms, where significant wavelet coefficients can be chosen as peaks using the continuous wavelet transform (CWT). Causality can be established by multiple-factor models. Factor subsets are selected from factors with sample lags, which are represented by selecting appropriate wavelet coefficients in terms of both time and frequency. | en_US
dcterms.extent | xiii, 152 pages : color illustrations | en_US
dcterms.isPartOf | PolyU Electronic Theses | en_US
dcterms.issued | 2024 | en_US
dcterms.educationalLevel | Ph.D. | en_US
dcterms.educationalLevel | All Doctorate | en_US
dcterms.LCSH | Big data | en_US
dcterms.LCSH | Data mining | en_US
dcterms.LCSH | Machine learning | en_US
dcterms.LCSH | Hong Kong Polytechnic University -- Dissertations | en_US
dcterms.accessRights | open access | en_US

Files in This Item:
File | Description | Size | Format
7575.pdf | For All Users | 2.31 MB | Adobe PDF


Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/13123