Abstract
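Evaluating the similarity of social tags is the foundation of tag-based link prediction and personalized recommendation. Existing methods measure tag similarity with vector space matrices or with graph- or network-based tag co-occurrence relations; in doing so they split the intrinsic "user-resource-tag" ternary relationship of social tagging systems and lose semantic associations. To address this, this paper introduces the hyper-network model, which can systematically characterize the "user-resource-tag" ternary relationship, and proposes a hyper-network-based method for evaluating social tag similarity. Starting from users' social tagging behavior, the method represents tags as nodes and each user's annotation of a resource as a hyper-edge, and so constructs a social tag hyper-network. On this basis, two basic principles for hyper-network-based tag similarity measurement are established: the common hyper-edge principle and the principle of the number of nodes contained in a hyper-edge. A series of hyper-network-based tag similarity indices is constructed accordingly. Representative social tagging datasets are selected, and the constructed indices are evaluated experimentally with the AUC and Precision evaluation methods of link prediction. The results show that the indices built on the common hyper-edge principle alone, and those combining the common hyper-edge principle with the number-of-nodes principle, outperform tag similarity indices built on the tag co-occurrence network, with the improvement especially marked on Precision.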
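The construction and the two principles described above can be illustrated with a minimal Python sketch. The toy tagging data, the function names, and in particular the `1 / (size - 1)` weighting are assumptions made for illustration only; the abstract does not state the exact formulas of the proposed indices.

```python
from collections import defaultdict

# Hypothetical toy data: each tagging action is (user, resource, set of tags).
taggings = [
    ("u1", "r1", {"python", "code"}),
    ("u2", "r1", {"python", "code", "tutorial"}),
    ("u1", "r2", {"python", "tutorial", "web", "code"}),
    ("u3", "r3", {"music", "rock"}),
]


def build_hypernetwork(taggings):
    """Treat every tagging action as one hyper-edge over its tags.

    Returns a map from tag to the ids of hyper-edges containing it,
    and a map from hyper-edge id to the number of tag nodes it holds.
    """
    incidence = defaultdict(set)
    sizes = {}
    for edge_id, (_user, _resource, tags) in enumerate(taggings):
        sizes[edge_id] = len(tags)
        for tag in tags:
            incidence[tag].add(edge_id)
    return incidence, sizes


def common_hyperedges(incidence, a, b):
    """Principle 1: more shared hyper-edges means more similar tags."""
    return len(incidence.get(a, set()) & incidence.get(b, set()))


def size_weighted_similarity(incidence, sizes, a, b):
    """Principle 2 (assumed weighting): a shared hyper-edge counts for
    less when it contains many tag nodes."""
    shared = incidence.get(a, set()) & incidence.get(b, set())
    return sum(1.0 / (sizes[e] - 1) for e in shared)


incidence, sizes = build_hypernetwork(taggings)
count = common_hyperedges(incidence, "python", "code")
weighted = size_weighted_similarity(incidence, sizes, "python", "code")
```

In this toy data "python" and "code" share three hyper-edges, while "music" and "rock" share one small hyper-edge; the weighted variant gives that single two-tag hyper-edge full weight and discounts the larger ones.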
Social tags express users' preferences by letting users describe online resources in their own terms, and they build connections between users and resources. As a valuable resource, social tags have been exploited in link prediction and personalized recommendation to address information overload in the era of big data. Evaluating social tag similarity is the foundational issue of tag-based link prediction and personalized recommendation. Current methods of tag similarity evaluation, based on representations such as the vector space matrix, bipartite graph, tripartite graph and tag co-occurrence network, split the internal user-resource-tag relationship of social tagging systems during their transformation processes, resulting in some loss of the semantic association between tags. To overcome this problem, this paper introduces the hyper-network model, which can systematically describe the internal ternary user-resource-tag relationship, and proposes a hyper-network-based approach to measuring social tag similarity. The proposed approach starts from users' social tagging behavior and builds a social tag hyper-network in which a tagging action is expressed as a hyper-edge and tags are expressed as nodes. Because the constructed hyper-network links users, resources and tags in tagging activities through hyper-edges, it depicts users' tagging behavior more accurately and preserves the intrinsic semantic association information of the user-resource-tag ternary relationship. By combining the topological structure of the social tag hyper-network with two fundamentals for describing the degree of association and similarity of objects from their relations, namely the proximity relation rule and triadic closure, the paper establishes two basic principles for measuring social tag similarity on the constructed hyper-network. The first is the principle of common hyper-edges: the more hyper-edges two tag nodes share, the more similar the two tag nodes are. The second is the principle of the number of nodes in a hyper-edge: the fewer tag nodes a hyper-edge contains, the more similar those tag nodes are. Based on these two principles, a series of social tag similarity measures is established by following the logic used to construct node similarity indices in general complex networks. An experimental study verifies the constructed similarity measures on data sets from two representative social tagging applications, Delicious and Last.fm, using the AUC and Precision evaluation methods of link prediction. In terms of the AUC and Precision criteria, the experimental results show that the tag similarity measures built on the common hyper-edge principle alone, and those combining the number-of-nodes principle with the common hyper-edge principle, clearly outperform the tag similarity index built on the tag co-occurrence network. In particular, the distinct improvement in Top-N Precision is significant for improving the accuracy of personalized recommendation. The experimental results also show that adding different normalizations by node hyper-degree to the common hyper-edge measure has a somewhat negative effect on the accuracy of tag similarity measurement. The social tag similarity measures in the proposed hyper-network-based approach are built mainly by combining two basic structural features of networks: the common hyper-edges of nodes and the number of nodes in a common hyper-edge. However, the situations and factors that affect the semantic similarity of tags are complicated. For example, the "weak tie effect" that exists in networks may affect the predictions of a method that reflects strong connections through common hyper-edges, which is worth further exploration. In addition, social tag hyper-networks have many other topological features, such as the distance and paths between nodes. Further work can explore the relationship between such topological features and the similarity of tag nodes, so as to build more effective social tag similarity measures.
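The two evaluation criteria named above, AUC and Precision, have standard definitions in the link prediction literature, which can be sketched as follows. This is a minimal illustration, not the authors' experimental code: the scores, the candidate tag pairs, and the function names are hypothetical, and the AUC is computed exhaustively over all pairs instead of by random sampling.

```python
def link_prediction_auc(scores, test_links, nonexistent_links):
    """Probability that a held-out (test) link scores higher than a
    nonexistent link, counting ties as one half."""
    higher = 0
    ties = 0
    for t in test_links:
        for n in nonexistent_links:
            if scores[t] > scores[n]:
                higher += 1
            elif scores[t] == scores[n]:
                ties += 1
    total = len(test_links) * len(nonexistent_links)
    return (higher + 0.5 * ties) / total


def precision_at(scores, test_links, top_n):
    """Fraction of the top_n highest-scored candidate links that are
    actually held-out test links."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_n]
    hits = sum(1 for link in ranked if link in test_links)
    return hits / top_n


# Hypothetical similarity scores for candidate tag pairs.
scores = {
    ("a", "b"): 0.9,
    ("b", "d"): 0.7,
    ("a", "c"): 0.4,
    ("c", "d"): 0.4,
    ("a", "d"): 0.1,
}
test_links = {("a", "b"), ("a", "c")}
nonexistent_links = {("b", "d"), ("c", "d"), ("a", "d")}

auc_value = link_prediction_auc(scores, test_links, nonexistent_links)
precision_value = precision_at(scores, test_links, top_n=2)
```

With these toy values the AUC is 0.75 and the Precision for the top two predictions is 0.5, since only one of the two highest-scored pairs is a held-out link.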
Authors
潘旭伟
曾雪梅
李涛
PAN Xuwei; ZENG Xuemei; LI Tao (School of Economics and Management, Zhejiang Sci-Tech University, Hangzhou 310018, China)
Source
《运筹与管理》
CSSCI
CSCD
Peking University Core Journals (北大核心)
2023, Issue 9, pp. 215-221 (7 pages)
Operations Research and Management Science
Funding
Key Project of the Zhejiang Provincial Natural Science Foundation (LZ18G010001).