
Research of unstructured sensitive information recognition based on transformer

Original title: 基于transformer的非结构化文档敏感数据识别方法研究
Abstract: Methods for identifying sensitive data in structured sources are approaching maturity, but intelligent identification of sensitive data in unstructured documents is still at the research stage. To address this need, this paper proposes a transformer-based method for identifying sensitive data in unstructured documents. The method combines a Word2vec word embedding model with a transformer model, uses the self-attention mechanism to capture semantic relationships within the context, and relies on parallel computation for fast, efficient identification. Simulation and calculation on experimental data yield a relatively high identification accuracy, which demonstrates the effectiveness of the algorithm.
Authors: XU Shi-quan (徐世权); NI Ning-ning (倪宁宁); LIU Jia (刘佳); ZHANG Gao-shan (张高山); CHEN Min-shi (陈敏时). Affiliations: China Mobile Group Co., Ltd., Beijing 100032, China; China Mobile Group Design Institute Co., Ltd., Beijing 100080, China.
Source: Telecom Engineering Technics and Standardization (《电信工程技术与标准化》), 2023, No. 9, pp. 28-32, 36 (6 pages in total)
Keywords: sensitive information recognition; unstructured; transformer; data security