Abstract
In the document utilization phase, deep learning recognition algorithms are used to automatically detect, analyze, and evaluate the content of documents that users upload for testing. An OCR detection deep learning algorithm trained on a dedicated sensitive-word training set is combined with a TF-IDF-based algorithm for recognizing sensitive-word topics in documents, giving multi-dimensional detection of sensitive words. The detected sensitive information is compiled into a detection report for users or reviewers to check and confirm, which helps users handle the sensitive information. The approach aims to substantially improve the accuracy and efficiency of sensitive-word detection, reduce the error rate of manual verification, and thereby protect enterprise data security as far as possible.
Authors
DENG Youqi (邓又琦)
ZHANG Ming (张明)
MA Jingji (马敬济)
Affiliations: No.724 Research Institute of CSSC, Nanjing 211153; No.719 Research Institute of CSSC, Wuhan 430205
Source
Computer & Digital Engineering (《计算机与数字工程》)
2024, No. 8, pp. 2435-2439 (5 pages)
Keywords
deep learning
algorithm
document
sensitive word
detection
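
Illustrative sketch
The abstract names TF-IDF as the basis for recognizing sensitive-word topics but gives no implementation details. The following is a minimal sketch of plain TF-IDF scoring, not the authors' method. The function names (tf_idf, flag_sensitive), the toy corpus, the lexicon, and the threshold are all hypothetical, and the documents are assumed to be already tokenized (Chinese text would first need a word segmenter).

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Uses tf = count / document length and idf = ln(N / df),
    the unsmoothed textbook form (an assumption, not the paper's variant).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def flag_sensitive(doc_scores, lexicon, threshold):
    """Return lexicon terms whose score meets the threshold, sorted by term."""
    return sorted(
        term for term, score in doc_scores.items()
        if term in lexicon and score >= threshold
    )


# Hypothetical toy corpus and sensitive-word lexicon, for illustration only.
docs = [
    ["project", "budget", "secret", "radar", "secret"],
    ["project", "meeting", "schedule"],
    ["project", "lunch", "menu"],
]
lexicon = {"secret", "radar"}

scores = tf_idf(docs)
flagged = flag_sensitive(scores[0], lexicon, threshold=0.1)
print(flagged)
```

In this sketch a term that appears in every document, such as "project", scores zero because its idf is ln(1), so only terms distinctive to a document can cross the threshold and appear in a report.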