Darknet Webpage Classification Based on Convolutional Neural Network (cited by: 1)

Abstract: Screening the enormous volume of darknet webpages for content on sensitive topics is of great importance to law enforcement. Based on an in-depth analysis of the textual characteristics and categories of webpages on Freenet and other darknets, this paper proposes a TextCNN-based model for classifying darknet webpages by topic. The model first preprocesses the data to account for the non-standardized language of darknet webpages. It then represents the page content with pretrained word vectors, applies convolution kernels of several sizes to obtain feature maps, and uses max pooling to produce the final feature vector. The convolutional network is regularized, and a softmax function predicts the class probabilities. Experiments show that the method reaches a precision of 86.01%, a recall of 78.97%, and a Macro-F1 of 82.33%, higher than the machine learning models it is compared against, and can effectively address the darknet webpage classification problem.
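The forward pass described in the abstract (word vectors, convolutions with several kernel sizes, max pooling, softmax) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the vocabulary size, embedding dimension, kernel sizes (2, 3, 4), number of filters, ReLU activation, zero padding, and random weights are all assumptions chosen for the demo. The regularization step is left out because the abstract does not say which technique was used.

```python
import math
import random


def conv1d_relu(embedded, kernel, bias):
    """Slide one kernel of shape (k, dim) over the token sequence.

    Returns a 1-D feature map with one ReLU activation per window.
    """
    k = len(kernel)
    feature_map = []
    for start in range(len(embedded) - k + 1):
        total = bias
        for offset in range(k):
            row = embedded[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], row))
        feature_map.append(max(0.0, total))  # ReLU (an assumed activation)
    return feature_map


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def textcnn_forward(token_ids, embeddings, kernels, dense_w, dense_b):
    """Embedding lookup -> multi-size convolution -> max pooling -> softmax."""
    dim = len(embeddings[0])
    embedded = [embeddings[t] for t in token_ids]
    # Zero-pad short inputs so the widest kernel fits at least once.
    widest = max(len(kernel) for kernel, _ in kernels)
    while len(embedded) < widest:
        embedded.append([0.0] * dim)
    # Max-over-time pooling keeps one scalar per kernel.
    pooled = [max(conv1d_relu(embedded, kernel, bias)) for kernel, bias in kernels]
    logits = [
        sum(w * p for w, p in zip(row, pooled)) + b
        for row, b in zip(dense_w, dense_b)
    ]
    return softmax(logits)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes; a real model would load pretrained word vectors instead.
rng = random.Random(0)
VOCAB, DIM, NUM_CLASSES = 20, 8, 4
KERNEL_SIZES = (2, 3, 4)
FILTERS_PER_SIZE = 2

embeddings = random_matrix(VOCAB, DIM, rng)
kernels = [
    (random_matrix(k, DIM, rng), 0.0)
    for k in KERNEL_SIZES
    for _ in range(FILTERS_PER_SIZE)
]
dense_w = random_matrix(NUM_CLASSES, len(kernels), rng)
dense_b = [0.0] * NUM_CLASSES

probs = textcnn_forward([3, 7, 1, 12, 5, 9], embeddings, kernels, dense_w, dense_b)
print(probs)  # one probability per (hypothetical) topic class
```

Each kernel contributes a single pooled value regardless of page length, which is why pages of very different sizes can feed the same fixed-width classification layer.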
Authors: Hong Liangyi (洪良怡); Zhu Songlin (朱松林); Wang Yijun (王轶骏); Xue Zhi (薛质) (School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; Nantong Public Security Bureau, Nantong 226001, Jiangsu, China)
Source: Computer Applications and Software (《计算机应用与软件》), Peking University Core Journal list, 2023, No. 2, pp. 320-325, 330 (7 pages)
Funding: National Key Research and Development Program of China, "Cyberspace Security" key special project (2016QY01W0202).
Keywords: Darknet; Webpage classification; Convolutional neural network (CNN); Machine learning
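The precision, recall, and Macro-F1 figures in the abstract are macro-averaged multi-class metrics. The sketch below shows one common way to compute them, averaging per-class scores with equal weight per class. The labels and predictions are made-up toy data, not results from the paper, and the function names are hypothetical. The abstract does not state which Macro-F1 convention was used: the harmonic mean of the reported precision and recall is about 82.34%, close to but not identical with the reported 82.33%.

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of per-class precision, recall, and F1."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Toy labels for illustration only.
y_true = ["drugs", "drugs", "forum", "forum", "forum", "other"]
y_pred = ["drugs", "forum", "forum", "forum", "other", "other"]

macro_p, macro_r, macro_f1 = macro_average(per_class_prf(y_true, y_pred))
harmonic = 2 * macro_p * macro_r / (macro_p + macro_r)
print(f"macro P={macro_p:.4f} R={macro_r:.4f} F1={macro_f1:.4f}")
print(f"harmonic mean of macro P and macro R={harmonic:.4f}")
```

On this toy data the mean of per-class F1 scores and the harmonic mean of macro precision and macro recall differ, which is why it matters to state the convention when reporting Macro-F1.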