Abstract
Dark web users carry out a large volume of illegal activity on underground markets. The anonymity of the dark web makes communication among these users easy but makes the work of law enforcement much harder. In recent years, deep neural networks have succeeded across many fields, and a growing number of researchers use them to identify the authors of anonymous online text. To align dark web users more effectively, that is, to find more of the distinct accounts that belong to the same identity, we apply neural network methods to dark web user identification and alignment. Existing methods, however, mainly target short texts and handle global and long-sequence information poorly. This paper proposes DACN, a method that augments the convolution operator with a self-attention mechanism and uses long-sequence information to model the texts that dark web users post. Starting from text content, DACN links the multiple accounts of anonymous dark web users, aggregating the information from those accounts and providing more clues to a user's real identity. We evaluate the method comprehensively on two dark web market forums and compare it with current state-of-the-art techniques. On the two public datasets, mean reciprocal rank (MRR) improves by about 2.9% and 3.6%, and Recall@10 by about 2.3% and 3.0%, respectively, which is strong evidence of the method's effectiveness on dark web market forums.
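The two metrics quoted in the abstract can be illustrated with a minimal sketch. It reads MRR in its standard sense of mean reciprocal rank and assumes each query account has exactly one true matching account; the account IDs, the toy rankings, and the function names are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def reciprocal_rank(ranked: Sequence[str], target: str) -> float:
    """Return 1/position of the target in a ranked candidate list, or 0.0 if absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == target:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]], targets: Sequence[str]) -> float:
    """Average the reciprocal rank of the true match over all query accounts."""
    total = sum(reciprocal_rank(r, t) for r, t in zip(rankings, targets))
    return total / len(targets)


def recall_at_k(rankings: Sequence[Sequence[str]], targets: Sequence[str], k: int = 10) -> float:
    """Fraction of query accounts whose true match appears in the top k candidates."""
    hits = sum(1 for r, t in zip(rankings, targets) if t in r[:k])
    return hits / len(targets)


# Hypothetical toy data: one ranked candidate list per query account.
rankings = [
    ["u7", "u2", "u9"],  # true match "u2" is ranked 2nd
    ["u4", "u1", "u3"],  # true match "u4" is ranked 1st
    ["u5", "u6", "u8"],  # true match "u0" was not retrieved
]
targets = ["u2", "u4", "u0"]

mrr = mean_reciprocal_rank(rankings, targets)       # (1/2 + 1 + 0) / 3
recall_at_2 = recall_at_k(rankings, targets, k=2)   # 2 of 3 queries hit in the top 2
print(f"MRR={mrr:.3f}  Recall@2={recall_at_2:.3f}")
```

In this reading, Recall@10 is the same computation with `k=10`. A higher MRR means the true matching account tends to appear nearer the top of each ranked list.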
Authors
杨燕燕
杜彦辉
刘洪梦
赵佳鹏
时金桥
王学宾
YANG Yanyan; DU Yanhui; LIU Hongmeng; ZHAO Jiapeng; SHI Jinqiao; WANG Xuebin (Department of Information Technology and Cyber Security, People’s Public Security University of China, Beijing 100038, China; School of Cyber Space Security, Beijing University of Posts and Telecommunications, Beijing 100876, China; Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100080, China)
Source
《西安电子科技大学学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2023, Issue 4, pp. 206-214 (9 pages)
Journal of Xidian University
Funding
National Key Research and Development Program of China (2021YFB3100600).
Keywords
text embedding
attention mechanism
convolution operator (convolutional networks)
long sequence information