Natural Noise Filtering Algorithm for Point-of-Interest Recommender Systems
Abstract: The source data of recommender systems (RSs) contain inherent natural noise, which introduces error and interference into recommendation algorithms. Existing studies focus mainly on malicious noise, typified by various security attacks; only a few address natural noise, which is more subtle and harder to handle, and almost all of those concern conventional recommendation domains. In point-of-interest (POI) recommendation, both the characteristics of the source data and the causes and manifestations of natural noise differ considerably from those in conventional RSs. To filter natural noise in POI RSs, this paper proposes NFDC, a natural noise filtering algorithm based on dispersion quantification and clustering distance analysis. The algorithm defines and computes the dispersion of user check-in data to quantify data-driven uncertainty, and uses the accuracy of the recommendation algorithm (the F1 score) to quantify prediction-driven uncertainty. It then mines the correlation between the two, builds an empirical model, and derives the proportion of potential natural noise. Fuzzy C-means clustering is used to analyze the similarity of user behavior patterns; suspicious noise is screened on the basis of clustering distance analysis, and a customized verification rule is applied to confirm and delete the true natural noise. On two real-world location-based social network datasets, Brightkite and Gowalla, the source data are preprocessed with NFDC and with four baseline methods, and each processed dataset is fed into five representative POI recommendation algorithms to compare how much the different denoising techniques improve recommendation accuracy. Experimental results show that NFDC effectively reduces natural noise in the source data and provides reliable input for the subsequent recommendation algorithms. Compared with the highest recommendation accuracy obtained on the datasets denoised by the other methods, the accuracy of the recommendation algorithms on the NFDC-processed Brightkite and Gowalla datasets improves by 15.95% and 5.00% on average, respectively.
Authors: ZHU Jun; HAN Lixin; ZONG Ping; XU Yiqing; XIA Ji'an; TANG Ming (School of Computer and Software, Nanjing Vocational University of Industry Technology, Nanjing 210023, China; College of Computer and Information Engineering, Hohai University, Nanjing 211100, China)
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 11, pp. 132-142 (11 pages)
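Funding: National Natural Science Foundation of China (41771251); Natural Science Research Project of Jiangsu Higher Education Institutions (21KJB520009); Start-up Research Fund for Introduced Talents of Nanjing Vocational University of Industry Technology (YK23-05-01).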
Keywords: Recommender system; Point-of-interest recommendation; Natural noise; Uncertainty; Dispersion; Clustering
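
The abstract describes screening suspicious noise by fuzzy C-means clustering followed by clustering distance analysis, but it does not give the formulas. The sketch below is one possible reading of that step, not the authors' implementation. It assumes each record is already a numeric feature vector, takes the potential noise proportion as a given input (`noise_ratio`, which the paper derives from its empirical model), and scores records by distance to the nearest cluster centre. The function names, the scoring rule, and the toy data are all hypothetical, and the paper's verification rule is not modelled.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns (centers, membership matrix)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalised to sum to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centres are membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Point-to-centre distances; the epsilon avoids division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        shift = np.abs(U_new - U).max()
        U = U_new
        if shift < tol:
            break
    return centers, U

def screen_suspicious(X, noise_ratio, c=2, m=2.0, seed=0):
    """Return indices of the records farthest from any cluster centre.

    Assumption: 'clustering distance analysis' is approximated here by the
    distance to the nearest centre; the paper may use a different measure.
    """
    centers, _ = fuzzy_c_means(X, c=c, m=m, seed=seed)
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    nearest = dist.min(axis=1)
    k = int(round(noise_ratio * len(X)))
    # Largest nearest-centre distances first.
    return np.argsort(nearest)[::-1][:k]

# Hypothetical toy data: two tight behaviour clusters plus one stray record.
X = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0],
    [5.5, 5.5],
])
suspects = screen_suspicious(X, noise_ratio=0.12)
print(suspects)
```

In this reading the flagged indices are only candidates; according to the abstract, NFDC applies a further customized rule before deleting anything.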