|Title:||AMCOR : an efficacious algorithm for mining cross-object relationship among very large databases over the internet|
Hong Kong Polytechnic University -- Dissertations
Department of Computing
|Pages:||iv, 77 leaves : ill. ; 31 cm|
|Abstract:||The aim of this project is to propose an efficacious algorithm for mining cross-object relationships (AMCOR) among very large databases over the Internet. These relationships are defined by association rules that relate different items in different transactions that are sparsely stored in different geographical locations. The motivation is to formulate the measures so that the proposed data-mining algorithm works effectively and efficiently for mining cross-object relationships in real time. This project is motivated by the fact that the technology for mining association rules that relate objects from geographically different sources is still rudimentary. The novelty of the AMCOR approach is that, conceptually, a transaction is divided longitudinally rather than latitudinally, as in most related previous work. The three objectives of this project are: a) Propose and develop an algorithm for real-time mining of association rules that define cross-object relationships, in a longitudinal manner and over the Internet. b) Verify this algorithm in a stable Internet-based environment. c) Generalize the verified algorithm for wider applications. The association rules that we consider are of the form X => A, where the symbol => marks the relationship between the two large itemsets X and A. An itemset is large provided that its count is greater than or equal to the minimum support count. When X => A holds, it does not necessarily follow that X+Y => A holds, because the latter may not have the minimum support count. Similarly, when X => Y and Y => Z hold, it does not necessarily follow that X => Z holds (transitivity), because the latter may not have the minimum confidence. Therefore, we need to find all the large itemsets and pool them together before the decision can be made. In distributed cases, exemplified by the Internet, the communication costs among different sites are high.
The mining overhead for the AMCOR approach can be generalized as: mining overhead (MO) = computation cost (CC) + communication overhead (CMO) + housekeeping overhead (HO). Test results with the Java-based mobile-agent platform Aglets over the Internet have shown that the AMCOR approach is indeed a viable solution for mining association rules in a longitudinal fashion. The performance of this approach improves with larger databases and a larger number of mining agents. This phenomenon is the result of the communication-to-computation ratio, which is in fact affected by the degree of overlapped parallelism among all the collaborating, asynchronously parallel agents. Therefore, the suggested immediate future work is to improve such overlapped parallelism.|
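The abstract's remarks on minimum support and minimum confidence can be illustrated with a small sketch. It is not the AMCOR algorithm itself, only the standard support-count and confidence checks applied to a made-up, single-site transaction list. The thresholds, the item names (`x`, `y`, `z`, `w`), and the helper names are all assumptions introduced for illustration.

```python
# Illustrative thresholds; these values are assumptions, not taken from the dissertation.
MIN_SUPPORT_COUNT = 2
MIN_CONFIDENCE = 0.6


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    wanted = frozenset(itemset)
    return sum(1 for t in transactions if wanted <= t)


def is_large(itemset, transactions):
    """An itemset is large if its count is >= the minimum support count."""
    return count(itemset, transactions) >= MIN_SUPPORT_COUNT


def confidence(lhs, rhs, transactions):
    """count(lhs and rhs together) / count(lhs); 0.0 if lhs never occurs."""
    lhs_count = count(lhs, transactions)
    if lhs_count == 0:
        return 0.0
    return count(set(lhs) | set(rhs), transactions) / lhs_count


def rule_holds(lhs, rhs, transactions):
    """lhs => rhs holds if the combined itemset is large and confident enough."""
    combined = set(lhs) | set(rhs)
    return (is_large(combined, transactions)
            and confidence(lhs, rhs, transactions) >= MIN_CONFIDENCE)


# Made-up transactions, chosen so that the two caveats in the abstract show up.
TRANSACTIONS = [
    frozenset({"x", "y", "w"}),
    frozenset({"x", "y"}),
    frozenset({"x", "y"}),
    frozenset({"x", "y", "z"}),
    frozenset({"x", "y", "z"}),
    frozenset({"y", "z"}),
    frozenset({"y", "z"}),
    frozenset({"y", "z"}),
]

if __name__ == "__main__":
    for lhs, rhs in [({"x"}, {"y"}), ({"y"}, {"z"}),
                     ({"x"}, {"z"}), ({"x", "w"}, {"y"})]:
        print(sorted(lhs), "=>", sorted(rhs), rule_holds(lhs, rhs, TRANSACTIONS))
```

With this data, x => y and y => z both hold, yet x => z does not: the itemset {x, z} is large, but the confidence of the rule falls below the threshold. Likewise, extending the left-hand side of x => y with w produces an itemset that misses the minimum support count. This is why the large itemsets have to be found and pooled before a rule can be accepted.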