Abstract
Most participatory web sites collect overall ratings (e.g., five stars) of products from their customers, reflecting the customers' overall assessment of the products. However, ratings of individual product features (such as the price, battery, screen, and lens of a digital camera) are more useful in helping customers make effective purchase decisions. Unfortunately, very few web sites collect feature ratings. In this paper, we propose a novel approach to accurately estimate feature ratings of products. The approach uses the information distance of reviews on the features to select user reviews that extensively discuss specific features of the products (called specialized reviews). Experiments on both annotated and real data show that the overall ratings of specialized reviews can be used to represent their feature ratings. Recommender systems can use the average of these overall ratings to provide feature-specific recommendations that better help users make purchasing decisions.
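The select-then-average idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the information distance is replaced here by a much cruder stand-in, the fraction of a review's sentences that mention a feature term. The function names, the `min_coverage` threshold, and the sample reviews are all hypothetical and serve only to show the shape of the computation.

```python
from statistics import mean


def feature_coverage(review_text, feature_terms):
    """Fraction of sentences that mention any feature term.

    Assumption: a crude proxy for how extensively a review discusses a
    feature; the paper itself uses information distance instead.
    """
    normalized = review_text.replace("!", ".").replace("?", ".")
    sentences = [s.strip().lower() for s in normalized.split(".") if s.strip()]
    if not sentences:
        return 0.0
    hits = sum(any(term in s for term in feature_terms) for s in sentences)
    return hits / len(sentences)


def estimate_feature_rating(reviews, feature_terms, min_coverage=0.5):
    """Average the overall ratings of reviews treated as specialized.

    `reviews` is a list of (text, overall_rating) pairs. Returns None when
    no review reaches the (hypothetical) coverage threshold.
    """
    selected = [
        rating
        for text, rating in reviews
        if feature_coverage(text, feature_terms) >= min_coverage
    ]
    return mean(selected) if selected else None


# Made-up sample data, for illustration only.
reviews = [
    ("The battery lasts two days. Battery charging is quick.", 5),
    ("Battery drains fast. The battery died in a month.", 2),
    ("Nice color. Arrived on time. Good box.", 4),
]
battery_rating = estimate_feature_rating(reviews, ["battery"])
```

In this sketch only the first two reviews count as specialized on the battery, so the third review's overall rating does not influence the battery estimate.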
Additional information
This work was done while C. Long was a Ph.D. student at Tsinghua University.
About this article
Cite this article
Long, C., Zhang, J., Huang, M. et al. Estimating feature ratings through an effective review selection approach. Knowl Inf Syst 38, 419–446 (2014). https://doi.org/10.1007/s10115-012-0495-8
DOI: https://doi.org/10.1007/s10115-012-0495-8