Abstract
Existing joint user geolocalization methods based on user-generated text and social relationships do not sufficiently mine the location correlation between heterogeneous data in social media. To address this problem, a social media user geolocalization method based on multiple mention relationships is proposed. First, a heterogeneous network with three types of nodes (users, location indicative words, and geographic nouns) is constructed by jointly considering the mention relationships between users, users' mentions of location indicative words, and users' mentions of geographic nouns in social media text. Second, a user-word-location simplification algorithm based on common mention relationships is proposed to construct a user-location heterogeneous network that connects users who are located near each other more closely. Third, a biased random walk algorithm is proposed to sample the nodes of the graph, so as to fully explore the network structure and alleviate the sparsity of known locations. Finally, a neural network classifier based on a multilayer perceptron is used to infer each user's location. Experimental results on three representative Twitter data sets, GEOTEXT, TW-US and TW-WORLD, show that the proposed method can significantly improve user geolocalization accuracy.
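The network simplification and sampling steps named in the abstract can be illustrated with a small sketch. The abstract does not give the paper's simplification rule or its walk bias, so everything here is an assumption made for illustration: the helper names `build_user_location_graph` and `biased_walk`, the toy `lexicon` that maps words to location labels, and the node2vec-style return/in-out parameters `p` and `q` that stand in for the unspecified bias.

```python
import random
from collections import defaultdict


def build_user_location_graph(user_mentions, word_to_location):
    """Collapse user -> word mentions into an undirected user-location graph.

    user_mentions: dict mapping a user id to the users/words it mentions.
    word_to_location: hypothetical lookup from a location-bearing word
    (location indicative word or geographic noun) to a location label.
    """
    graph = defaultdict(set)
    for user, mentioned in user_mentions.items():
        for item in mentioned:
            if item in word_to_location:
                target = word_to_location[item]   # word replaced by its location
            elif item in user_mentions:
                target = item                     # user-user mention is kept
            else:
                continue                          # word carries no location signal
            if target != user:
                graph[user].add(target)
                graph[target].add(user)
    return graph


def biased_walk(graph, start, length, p=1.0, q=0.5, rng=None):
    """Second-order random walk with node2vec-style bias (an assumed scheme).

    Weight 1/p for returning to the previous node, 1 for a node adjacent to
    the previous node, and 1/q for a node farther away.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(graph[current])
        if not neighbors:
            break
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)
            elif candidate in graph[previous]:
                weights.append(1.0)
            else:
                weights.append(1.0 / q)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Toy data: three users, two words that resolve to the same location label.
mentions = {
    "alice": ["bob", "brooklyn", "subway"],
    "bob": ["alice", "manhattan"],
    "carol": ["brooklyn"],
}
lexicon = {"brooklyn": "New York", "manhattan": "New York"}

graph = build_user_location_graph(mentions, lexicon)
walk = biased_walk(graph, "carol", length=5, rng=random.Random(0))
print(walk)
```

In this toy graph, `carol` never mentions another user, yet she becomes a two-hop neighbor of `alice` and `bob` through the shared location node, which is the intuition behind linking nearby users via common mentions. In a full pipeline the sampled walks would typically feed a node embedding model whose vectors are passed to the multilayer perceptron classifier. That stage is not sketched here.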
Authors
乔亚琼
罗向阳
马江涛
李晨亮
张萌
李瑞祥
QIAO Yaqiong; LUO Xiangyang; MA Jiangtao; LI Chenliang; ZHANG Meng; LI Ruixiang (Information Engineering University, Zhengzhou 450001, China; State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China; School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China; School of Cyber Science and Engineering, Wuhan University, Wuhan 430075, China)
Source
《通信学报》
EI
CSCD
Peking University Core Journals
2020, No. 12, pp. 72-81 (10 pages)
Journal on Communications
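Funding
The National Natural Science Foundation of China (No. U1804263, No. U1636219, No. 61872287, No. U1736214)
The National Key Research and Development Program of China (No. 2016QY01W0105, No. 2016YFB0801303)
The Zhongyuan Talents Program: Zhongyuan Science and Technology Innovation Leading Talent Project (No. 1052020KJLJ0025)
The Science and Technology Innovation Talent Project of Henan Province (No. 184200510018)
The Key Science and Technology Research Project of Henan Province (No. 202102310237)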
Keywords
social media
heterogeneous network
user geolocalization
mention relationship