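Abstract
Collaborative filtering is currently the most successful and most widely used recommendation algorithm in e-commerce recommender systems, but it is challenged by the sparsity of user rating data. After introducing the ways in which user preference data are acquired, this paper groups sparsity-alleviation techniques into six categories: setting default values, combining content-based filtering, dimensionality reduction, graph-theoretic methods, item-based rating prediction, and increasing user-system interaction. It reviews the research on each category of algorithms, compares the six categories, and finally discusses future research directions in the field.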
Today's e-commerce environment has evolved drastically in order to cope with information overload. Recommendation systems are currently used as virtual salespersons to help customers quickly locate personalized information and make purchase decisions efficiently. This technology compares the shopping behaviors and interests of users who share common tendencies, and then recommends products and services for users to purchase. The more ratings on products and services the system can collect, the more accurately it can recommend appropriate products and services to customers. However, with the ever-increasing number of shoppers and products sold, the user-item rating matrix has quickly grown into a very high-dimensional matrix. As a result, user ratings are sparsely distributed, and the share of rated entries is usually lower than 1%. This increasing sparsity has severely degraded the recommendation quality of collaborative filtering systems. In Section 1, we provide an overview of the importance of user preference data. User preference data are fundamental to any e-commerce recommendation system. The preference data include explicit ratings and implicit ratings. Explicit ratings are submitted manually by users to express their personal preferences. Implicit ratings are captured and tracked automatically by the recommendation system. Over time, the system learns about consumer shopping behavior and produces implicit ratings. Data mining technologies are helping to improve the precision of recommendation systems. In Section 2, we analyze six kinds of technologies with respect to their potential to ameliorate the sparsity problem in collaborative filtering algorithms: (1) offering default values, (2) combining content-based filtering, (3) reducing dimensionality, (4) applying graph-theoretic approaches, (5) predicting item ratings, and (6) adding user-system interactions. In Section 3, we analyze the performance of these six technologies qualitatively. The results show that dimensionality reduction is the best technology; however, its algorithms can be hard to program. Hence, dimensionality reduction can be used as the main technology to ameliorate the sparsity problem, and it deserves further research and improvement. The other five technologies can assist in alleviating the sparsity problem. This paper concludes with future research directions on the sparsity problem in collaborative filtering systems. These include deeply combining collaborative filtering with content-based filtering, integrating with web log mining technologies, building efficient rating encouragement mechanisms, and sharing customer data with corporate management systems.
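As a rough illustration of the sparsity problem and of the first technique listed above (offering default values), the following minimal Python sketch measures the share of rated entries in a toy user-item matrix and fills the gaps with each item's mean rating. The item-mean rule, the function names `density` and `fill_with_item_mean`, the toy data, and the fallback value of 3.0 (the midpoint of an assumed 1 to 5 scale) are all assumptions made for illustration; the abstract does not say which default-value scheme the paper surveys.

```python
from typing import List, Optional

# Rows are users, columns are items; None marks a missing rating.
Matrix = List[List[Optional[float]]]


def density(ratings: Matrix) -> float:
    """Return the fraction of user-item cells that hold a rating."""
    total = sum(len(row) for row in ratings)
    rated = sum(1 for row in ratings for r in row if r is not None)
    return rated / total if total else 0.0


def fill_with_item_mean(ratings: Matrix, fallback: float = 3.0) -> List[List[float]]:
    """Return a copy in which each missing rating is replaced by the
    mean of the observed ratings for that item (column).

    Items with no observed rating at all receive `fallback`, an assumed
    neutral value on a 1-5 scale.
    """
    n_items = len(ratings[0]) if ratings else 0
    filled = [row[:] for row in ratings]  # copy so the input is not modified
    for j in range(n_items):
        observed = [row[j] for row in ratings if row[j] is not None]
        default = sum(observed) / len(observed) if observed else fallback
        for row in filled:
            if row[j] is None:
                row[j] = default
    return filled


# Toy data (hypothetical): 3 users, 4 items, 4 of 12 cells rated.
ratings: Matrix = [
    [5.0, None, None, 1.0],
    [None, 4.0, None, None],
    [3.0, None, None, None],
]

filled = fill_with_item_mean(ratings)
print(f"density: {density(ratings):.3f}")
print(filled)
```

Real e-commerce matrices are far sparser than this toy example, which is why a plain default fill is usually treated as a supporting step rather than a complete remedy.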
Source
Journal of Industrial Engineering and Engineering Management (管理工程学报)
CSSCI
Peking University Core Journals
2011, No. 1, pp. 94-101 (8 pages)
Funding
Youth Fund Project of the Sichuan Provincial Department of Education (09ZB068)
Keywords
E-commerce
collaborative filtering
recommendation algorithm
sparsity