Abstract
Existing methods for measuring user behavior similarity from activity sequences do not account for the semantic similarity between activities. To address this gap, a user behavior similarity measure that supports activity semantics is proposed. First, the similarity between activities is computed by combining their adjacency relations with the text semantics of their labels. Second, an activity edit weight function and an activity sequence distance are defined. Finally, user behavior is modeled as a multiset of activity sequences, and the similarity of user behavior is computed with the Earth Mover's Distance (EMD). The feasibility and effectiveness of the proposed method are verified by comparison with current mainstream algorithms in terms of the satisfiability of metric properties and experimental evaluation on real-world datasets.
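The three steps in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the similarity table `SIM`, the unit insert/delete cost, the substitution cost of `1 - similarity`, the length normalisation, and the example users are all assumptions made for illustration. The EMD here is a brute-force version that only suits tiny inputs; a practical implementation would use a transportation or linear-programming solver instead.

```python
from itertools import permutations
from math import gcd

# Hypothetical activity similarities in [0, 1]. In the paper these would be derived
# from adjacency relations and label-text semantics, which are not reproduced here.
SIM = {
    frozenset(("pay", "checkout")): 0.8,
    frozenset(("browse", "search")): 0.6,
}


def activity_similarity(a, b):
    """Return 1.0 for identical activities, the table value if listed, else 0.0."""
    if a == b:
        return 1.0
    return SIM.get(frozenset((a, b)), 0.0)


def sequence_distance(s, t):
    """Weighted edit distance between two activity sequences, scaled to [0, 1].

    Assumed edit weights: insert/delete cost 1, substitute cost 1 - similarity.
    """
    rows, cols = len(s) + 1, len(t) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = float(i)
    for j in range(1, cols):
        d[0][j] = float(j)
    for i in range(1, rows):
        for j in range(1, cols):
            substitute = 1.0 - activity_similarity(s[i - 1], t[j - 1])
            d[i][j] = min(
                d[i - 1][j] + 1.0,             # delete s[i-1]
                d[i][j - 1] + 1.0,             # insert t[j-1]
                d[i - 1][j - 1] + substitute,  # substitute (free if identical)
            )
    longest = max(len(s), len(t))
    return d[-1][-1] / longest if longest else 0.0


def emd(multiset_a, multiset_b):
    """Brute-force EMD between two non-empty multisets {sequence: count}.

    Each multiset is treated as a distribution with mass count / total. Both are
    expanded into the same number of equal-mass units, so the optimal transport
    is a one-to-one matching, found here by trying every permutation.
    """
    n, m = sum(multiset_a.values()), sum(multiset_b.values())
    units = n * m // gcd(n, m)
    left = [s for s, c in multiset_a.items() for _ in range(c * units // n)]
    right = [s for s, c in multiset_b.items() for _ in range(c * units // m)]
    best = min(
        sum(sequence_distance(a, b) for a, b in zip(left, perm))
        for perm in permutations(right)
    )
    return best / units


# Hypothetical users, each a multiset of observed activity sequences.
user_a = {("browse", "pay"): 2, ("search", "pay"): 1}
user_b = {("search", "checkout"): 1}

similarity = 1.0 - emd(user_a, user_b)
print(f"user behavior similarity: {similarity:.4f}")
```

Because every sequence distance lies in [0, 1], the EMD does too, so `1 - emd(...)` can be read as a similarity score under these assumptions.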
Authors
LIN Zedong (林泽东) 1
ZENG Qingtian (曾庆田) 1
DUAN Hua (段华) 2
LU Faming (鲁法明) 1
ZOU Jie (邹杰) 3
1. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
2. College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao 266590, China
3. Research Institute of Highway, Ministry of Transport, Beijing 100088, China
Source
Computer Integrated Manufacturing Systems (《计算机集成制造系统》)
EI
CSCD
PKU Core
2018, No. 7, pp. 1806-1815 (10 pages)
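Funding
National Natural Science Foundation of China (71704096, 61602278, 61602279, 61472229, 31671588)
Science and Technology Development Program of Shandong Province (2014GGX101035, 2016ZDJS02A11)
Natural Science Foundation of Shandong Province (BS2014DX013, ZR2015FM013, ZR2017MF027)
Open Fund of the Ocean Telemetry Engineering Technology Research Center, State Oceanic Administration (2018002)
Research Institute of Highway, Ministry of Transport project (2015-9024, 2016-9027)
Shandong Province Postdoctoral Innovation Special Fund (201603056)
Leading Talent and Excellent Research Team Program of Shandong University of Science and Technology (2015TDJH102)
Humanities and Social Sciences Research Project of the Ministry of Education (No. 16YJCZH012)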
Keywords
user behavior similarity
text semantic similarity
similarity measure
Earth Mover's Distance (EMD)