Abstract
The number of tourist attractions is huge, while each user takes only a limited number of trips, so user travel data are very sparse, which seriously degrades the accuracy of recommendation results. To address this problem, feature tags for four factors closely related to tourist attractions (region, time, topic, and type) are extracted from a large corpus of travel notes to enrich the data. On the one hand, attractions are recommended to users by a tag-content-based recommendation algorithm. On the other hand, attraction feature tags are used to describe user interests; similar users are then found according to their interest tags, and attractions are recommended through collaborative filtering. Experimental results show that the tag-based collaborative filtering algorithm improves recommendation accuracy by 63.7% over the traditional collaborative filtering algorithm and by 22.5% over the attraction-popularity-based recommendation algorithm, while the tag-content-based recommendation algorithm improves accuracy by 27.6% over the attraction-popularity-based recommendation algorithm. The two tag-based algorithms are further combined by linear weighting so that they complement each other, yielding better recommendation results. The resulting tag-based hybrid algorithm increases accuracy by 61.3% over the tag-based collaborative filtering algorithm and by 54.7% over the tag-content-based recommendation algorithm. Higher recommendation accuracy improves the user experience and makes online travel websites more competitive.
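The pipeline summarized in the abstract (tag-content scoring, tag-based collaborative filtering, and a linear-weighted hybrid) can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine similarity measure, the neighbourhood size `k`, the weight `alpha`, and all function and variable names are assumptions chosen for illustration, and the abstract does not specify how tags are weighted or how scores are normalized.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse tag-weight dicts (assumed measure)."""
    dot = sum(w * b.get(tag, 0.0) for tag, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def content_scores(user_tags, attraction_tags):
    """Tag-content-based score: similarity of the user's tags to each attraction's tags."""
    return {a: cosine(user_tags, tags) for a, tags in attraction_tags.items()}


def cf_scores(target, user_profiles, visits, k=2):
    """Tag-based collaborative filtering: find the k users whose tag profiles are
    most similar to the target's, then score attractions they visited that the
    target has not, weighting each by the neighbour's similarity."""
    neighbours = sorted(
        ((cosine(user_profiles[target], profile), user)
         for user, profile in user_profiles.items() if user != target),
        reverse=True,
    )[:k]
    scores = {}
    for sim, user in neighbours:
        for attraction in visits[user]:
            if attraction not in visits[target]:
                scores[attraction] = scores.get(attraction, 0.0) + sim
    return scores


def hybrid_scores(content, cf, alpha=0.5):
    """Linear-weighted hybrid: alpha * content score + (1 - alpha) * CF score.
    alpha is a hypothetical tuning parameter; both inputs are assumed to be
    on comparable scales."""
    items = set(content) | set(cf)
    return {a: alpha * content.get(a, 0.0) + (1 - alpha) * cf.get(a, 0.0)
            for a in items}
```

In such a sketch, `alpha` would typically be tuned on held-out data; setting it to 1 or 0 recovers the pure tag-content-based or pure tag-based collaborative filtering ranking, respectively.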
Authors
LI Yamei (李雅美)
WANG Changdong (王昌栋)
School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China
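Funding
National Natural Science Foundation of China (61502543)
Natural Science Foundation of Guangdong Province, Distinguished Young Scholars Program (2016A030306014)
Natural Science Foundation of Guangdong Province, Doctoral Start-up Program (2014A030310180)
Fundamental Research Funds for the Central Universities (16lgzd15)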
Keywords
recommendation system
personalized travel
data mining
tag-based
collaborative filtering
content-based
hybrid recommendation