Author: Zhao, Lihan
Title: Incorporating total probability function in TF-IDF analysis for hotel reviews
Degree: M.Sc.
Year: 2013
Subject: Data mining; User-generated content -- Research; Hotels; Hong Kong Polytechnic University -- Dissertations
Department: Department of Computing
Pages: 67 leaves : ill. ; 30 cm.
Language: English
Abstract: This thesis introduces the total probability function into the task of sentiment classification of hotel reviews. The total probability function links the TF-IDF analysis of the text representation with the probabilities assigned by the emotional dictionary. The resulting term weights are reflected in a better vector space model for sentiment classification with a support vector machine (SVM). We provided a comprehensive review of sentiment classification, addressed the limitation of the feature selection method, and gave an introduction to the total probability function. We then demonstrated the sentiment classification procedure in detail and designed three methods to improve SVM performance. The data sets came from hotel reviews, and the entries of the emotional dictionary used in the process were divided into six groups based on part of speech. Three approaches were designed to derive new weights from TF-IDF. We compared the accuracy of these three methods with each other, and compared their training and testing times with those of other sentiment classification methods. Ultimately, the integrated model (TF-IDF plus the total probability function plus the part-of-speech mode combination approach) achieved the best SVM performance for sentiment classification of hotel reviews.
Rights: All rights reserved
Access: restricted access
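
Note on the abstract: the weighting idea it describes can be illustrated with a minimal sketch. This is not the thesis's implementation, and the record does not give its formulas. The sketch assumes that each dictionary term carries a probability of belonging to each part-of-speech group and a probability of being emotional within that group, that these are combined with the law of total probability, and that the result scales the TF-IDF weight by a factor of (1 + probability). The toy reviews, the dictionary values, the use of two groups instead of six, and the names `EMOTION_DICT`, `total_probability`, `tf_idf` and `reweight` are all hypothetical.

```python
import math
from collections import Counter

# Toy tokenised reviews (hypothetical, not from the thesis data set).
REVIEWS = [
    ["clean", "room", "friendly", "staff"],
    ["dirty", "room", "rude", "staff"],
    ["friendly", "staff", "great", "location"],
]

# Hypothetical emotional dictionary. For each term, one pair per
# part-of-speech group: (P(group | term), P(emotional | term, group)).
# The P(group | term) values of a term sum to 1.
EMOTION_DICT = {
    "clean": [(0.8, 0.9), (0.2, 0.5)],  # e.g. adjective, verb
    "friendly": [(1.0, 0.9)],
    "dirty": [(1.0, 0.95)],
    "rude": [(1.0, 0.95)],
    "great": [(1.0, 0.85)],
}


def total_probability(term):
    """Law of total probability: sum over groups of P(E | t, g) * P(g | t).

    Terms missing from the dictionary get probability 0.
    """
    return sum(p_group * p_emotion
               for p_group, p_emotion in EMOTION_DICT.get(term, []))


def tf_idf(docs):
    """Return one {term: tf-idf} dict per document (relative tf, log idf)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def reweight(vectors):
    """Assumed combination rule: tf-idf * (1 + P(emotional | term))."""
    return [
        {term: w * (1.0 + total_probability(term)) for term, w in vec.items()}
        for vec in vectors
    ]


weights = reweight(tf_idf(REVIEWS))
```

In a full pipeline the reweighted vectors would then be passed to a support vector machine classifier; that step is omitted here to keep the sketch dependency-free.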

Files in This Item:
File: b25786520.pdf
Description: For All Users (off-campus access for PolyU Staff & Students only)
Size: 1.02 MB
Format: Adobe PDF


Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/6898