Mining of stock data : finding similar stock patterns via clustering

Pao Yue-kong Library Electronic Theses Database


Author: Chan, Yun-pong Zack
Title: Mining of stock data : finding similar stock patterns via clustering
Degree: M.Sc.
Year: 2010
Subject: Hong Kong Polytechnic University -- Dissertations
Stocks -- China -- Hong Kong
Stock exchanges -- China -- Hong Kong
Database management
Database searching
Department: Dept. of Computing
Pages: ix, 71 leaves : ill. ; 31 cm.
Abstract: Knowledge discovery in databases (KDD) is now a well-known technology that integrates databases, artificial intelligence, and machine learning. Its objective has been described as the nontrivial extraction of implicit, previously unknown, but potentially useful and interesting knowledge from enormous amounts of data. With the increasing use of temporal data in data mining research and development, time series have become an important class of temporal data objects, and there is a need to research and develop methods for temporal data mining. Although the repository market as a whole has been dominated by relational databases, there is still a large demand for new database technology to support up-to-date data processing applications, such as those for time series data. Mining time series data is non-trivial because such data are characterized by their numerical values and continuous nature, and they are not straightforward to manipulate. If the data can be transformed appropriately, however, interesting patterns can be discovered and mining becomes a much easier task. For this reason, it is suggested that the basic time series unit be made consistent and representative. We term this process data transformation and regard it as one of the essential components of a time series data mining system. In this thesis, two algorithms for data transformation are proposed: normalization by scaling and min-max normalization. Normalization by scaling performs a linear transformation on the original data, while min-max normalization keeps the relationships among the original data values. Based on the transformed data, methods for finding the similarity between stock data are developed. They are based on the well-known k-means algorithm and are used to find similar data patterns. With the time series pattern matching scheme introduced, we propose to address the time series segmentation problem in a more flexible way.
Experimental results on the time series of various Hong Kong stocks are reported and used to illustrate the effectiveness of the proposed methods.
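
The pipeline outlined in the abstract (transform each series, then group similar series with k-means) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function names (`min_max_normalize`, `scale_by_max`, `kmeans`), the reading of "normalization by scaling" as division by the largest absolute value, the farthest-first choice of starting centroids, and the four short price series are all assumptions made up for illustration.

```python
def min_max_normalize(series, new_min=0.0, new_max=1.0):
    """Linearly map a series onto [new_min, new_max]."""
    lo, hi = min(series), max(series)
    if hi == lo:  # flat series: avoid division by zero
        return [new_min] * len(series)
    return [new_min + (x - lo) * (new_max - new_min) / (hi - lo) for x in series]


def scale_by_max(series):
    """ASSUMPTION: 'normalization by scaling' read as division by the
    largest absolute value; the thesis may define it differently."""
    peak = max(abs(x) for x in series)
    return [x / peak for x in series] if peak else list(series)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Deterministic farthest-first start (an assumption, chosen so the
    example is reproducible; random initialization is also common)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on equal-length series."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each series joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # no change since last pass: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical closing prices (not real market data, not from the thesis).
prices = {
    "RISING_A": [10, 11, 12, 13, 14],
    "RISING_B": [100, 120, 140, 160, 180],
    "FALLING_A": [50, 45, 40, 35, 30],
    "FALLING_B": [8, 7, 6, 5, 4],
}
names = list(prices)
normalized = [min_max_normalize(prices[n]) for n in names]
labels, centroids = kmeans(normalized, k=2)
for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

In this toy input the two rising series trade at very different price levels, so clustering the raw prices would group series by price magnitude rather than by shape. After min-max normalization both rising series have the same profile, which is why the transformation step comes before the similarity search.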

Files in this item

Files Size Format
b2352652x.pdf 3.686Mb PDF

