
Research on Methods for Sensitive Information Detection in Chinese Text
Abstract To curb the spread of harmful sensitive information on the internet and foster a clean, civil online environment, this paper studies the problem of detecting sensitive information in Chinese text. Based on a survey and analysis of the field, it proposes a three-stage detection framework consisting of sensitive word library construction, suspicious text discovery, and sensitive information detection, and gives the execution strategy and method for each stage. An experiment on a Word2vec-based method for expanding the sensitive word library was conducted, and the results indicate that the method is markedly effective.
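As a rough illustration of the library-expansion idea mentioned in the abstract, the sketch below adds to a seed lexicon every vocabulary word whose vector is close to a seed word's vector. The abstract does not describe the actual procedure, so this assumes the common approach of thresholding cosine similarity. The toy vectors (standing in for embeddings trained with Word2vec), the threshold value, and the names `TOY_VECTORS`, `cosine`, and `expand_lexicon` are all hypothetical.

```python
import math

# Hypothetical toy embeddings standing in for vectors learned by Word2vec.
# A real system would train them on a large segmented Chinese corpus.
TOY_VECTORS = {
    "赌博": [0.90, 0.10, 0.00],
    "博彩": [0.85, 0.15, 0.05],
    "下注": [0.80, 0.20, 0.10],
    "天气": [0.00, 0.10, 0.90],
    "新闻": [0.10, 0.90, 0.20],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_lexicon(seeds, vectors, threshold=0.95):
    """Return the seed words plus every word similar enough to some seed.

    Only the original seeds are used as query words, so the expansion is
    a single pass and does not chain through newly added words.
    """
    expanded = set(seeds)
    for seed in seeds:
        if seed not in vectors:
            continue  # a seed without an embedding cannot be expanded
        for word, vec in vectors.items():
            if word not in expanded and cosine(vectors[seed], vec) >= threshold:
                expanded.add(word)
    return expanded


if __name__ == "__main__":
    print(sorted(expand_lexicon(["赌博"], TOY_VECTORS)))
```

The threshold trades recall against precision: a lower value adds more candidate words but also more false positives, so candidates would normally be reviewed before they enter the library.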
Authors DONG Siyuan; WANG Ziyang; ZHANG Kun; SUN Meifeng (Guangling College of Yangzhou University, Yangzhou, Jiangsu 225000; Yangzhou Baoyang Digital Technology Company, Yangzhou, Jiangsu 225000)
Source Software (《软件》), 2024, Issue 3, pp. 51-53 (3 pages)
Funding Supported by the 2023 Jiangsu Province College Student Innovation and Entrepreneurship Training Program (202313987020Y).
Keywords Word2vec; sensitive information detection; Chinese text