Abstract
Existing research on user community division pays little attention to the subordinate associations between objects and attributes, which leads to poor feature selection performance, low accuracy, poor robustness, and high computational cost. Taking full account of the associations between users and their preferred topic words, this paper proposes a user community division method based on hierarchical coupling clustering, and compares four classification algorithms to determine the optimal clustering threshold and achieve the optimal division of preference domains. Experimental results show that the method performs well in recognizing social media user preferences and dividing communities. Because the clustering threshold is selected according to the value of the classification evaluation metric AUC, the influence of human factors on that choice is greatly reduced. The proposed method is practical and effective, and helps improve the performance of user preference recognition and community division.
Authors
刁雅静
吴嘉辉
卢健
王志英
朱庆康
DIAO Yajing; WU Jiahui; LU Jian; WANG Zhiying; ZHU Qingkang (School of Economics and Management, Jiangsu University of Science and Technology, Zhenjiang 212100, China)
Source
《江苏科技大学学报(自然科学版)》
CAS
Peking University Core Journals (北大核心)
2023, Issue 4, pp. 86-91 (6 pages)
Journal of Jiangsu University of Science and Technology: Natural Science Edition
Funding
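Jiangsu Provincial Social Science Foundation Project (22GLB037)
Major Project of Philosophy and Social Science Research in Jiangsu Universities (2020SJZDA065)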
Keywords
preference recognition
community division
coupling clustering
theme park