Author: Leung, Wing-ki Cane
Title: Enriching user and item profiles for collaborative filtering : from concept hierarchies to user-generated reviews
Degree: Ph.D.
Year: 2009
Subject: Hong Kong Polytechnic University -- Dissertations.
Recommender systems (Information filtering)
User interfaces (Computer systems)
Department: Department of Computing
Pages: xiv, 196 p. : ill. ; 30 cm.
Language: English
Abstract: Collaborative Filtering (CF) is a recommender systems technique that generates personalized recommendations for users based on user preferences. Such preferences are usually expressed in the form of numerical ratings, or binary votes such as purchase data. Despite its considerable success and popularity in both research and practice, CF suffers from the problems of data sparseness and cold-start recommendation, which is an extreme form of data sparseness. Specifically, CF algorithms have difficulty with generating reliable recommendations when data are sparse, and they cannot recommend items that have not received any ratings from users. This thesis addresses the problems of data sparseness and cold-start recommendation of CF along two dimensions. Firstly, we developed two novel recommendation algorithms based on association rule mining techniques. The proposed algorithms, namely FARAMS and CLARE, exploit the relationships between items that are encoded in the concept hierarchies of the items when users' preference data are too limited for generating recommendations. Specifically, FARAMS makes use of interesting associations between item categories to find recommendable items for users having limited known preferences, while CLARE generates recommendations for a given cold-start item by finding other items in the system that are highly correlated with the attributes of the cold-start item. We evaluated both algorithms based on widely adopted benchmarking datasets of CF. Results show that both algorithms outperform related algorithms in addressing data sparseness and the cold-start problem under similar experimental settings. Secondly, we investigated the use of user-generated reviews for generating personalized recommendations. We made three major contributions in this area. First, we collected and analyzed a set of movie reviews to understand how user opinions are expressed in user-generated reviews, which are free-form texts written in natural language. 
Based on the results of our analysis, we proposed a novel method for determining the sentimental orientations and strength of user opinions. Second, we proposed a rating inference framework, namely PREF, for augmenting ratings for CF. PREF aims at determining and representing the overall sentiments expressed in reviews as numerical ratings that can readily be used by existing CF algorithms. In other words, PREF enables existing CF algorithms to utilize textual reviews as an additional source of user preferences, thereby lessening the problem of data sparseness. Third, we found that user-generated reviews contain valuable information for constructing the interest profiles of users and domain items based on a real-world dataset of tourist attraction reviews. Using such information for generating personalized recommendations significantly improves the prediction quality and coverage of traditional CF algorithms. While existing CF algorithms operate on numerical ratings or binary votes of items, our research represents an important pioneering step towards a novel CF paradigm based on user-generated reviews.
Rights: All rights reserved
Access: open access

Files in This Item:
File           Description    Size     Format
b23064365.pdf  For All Users  2.19 MB  Adobe PDF


Copyright Undertaking

As a bona fide Library user, I declare that:

  1. I will abide by the rules and legal ordinances governing copyright regarding the use of the Database.
  2. I will use the Database for the purpose of my research or private study only and not for circulation or further reproduction or any other purpose.
  3. I agree to indemnify and hold the University harmless from and against any loss, damage, cost, liability or expenses arising from copyright infringement or unauthorized usage.

By downloading any item(s) listed above, you acknowledge that you have read and understood the copyright undertaking as stated above, and agree to be bound by all of its terms.

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/4230