
Auto-detection technology of text divulgence based on natural language processing (cited by: 2)
Abstract: The damage caused by text divulgence (leaks of textual information) is growing more severe, yet traditional leak detection still relies on manual inspection, which is inefficient and can itself cause secondary leaks. To address these problems, an automatic text divulgence detection technique based on natural language processing is proposed, combining automatic text similarity comparison with data encryption. In practice, overly coarse detection granularity can lead to missed detections, so similarity is computed at the level of natural paragraphs and sentences, which also allows suspected paragraphs and sentences to be located automatically. Finally, a text divulgence detection system is designed and implemented. Experimental results show that the technique is well suited to detecting leaks of classified text and offers confidentiality, little manual intervention, high efficiency, and localization of suspected paragraphs.
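Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 8, pp. 2600-2603 (4 pages)
Funding: China Postdoctoral Science Foundation (20080431114); Nanjing University of Information Science and Technology research fund (20070113)
Keywords: natural language processing; text divulgence; encryption; similarity detection; information extraction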
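
The paragraph- and sentence-level comparison described in the abstract can be illustrated with a minimal sketch. The record does not state which similarity measure, threshold, or segmentation rules the authors used, so everything below is an assumption made for illustration: cosine similarity over character bigrams, a threshold of 0.6, naive punctuation-based sentence splitting, and the hypothetical helper names `split_paragraphs`, `split_sentences`, `ngram_vector`, `cosine`, and `find_suspects`. The encryption step mentioned in the abstract is not modeled.

```python
import math
import re
from collections import Counter


def split_paragraphs(text):
    """Treat each non-empty line as one natural paragraph (assumption)."""
    return [p.strip() for p in re.split(r"\n+", text) if p.strip()]


def split_sentences(paragraph):
    """Naive split: a run of text plus any trailing end punctuation."""
    parts = re.findall(r"[^。！？.!?]+[。！？.!?]*", paragraph)
    return [s.strip() for s in parts if s.strip()]


def ngram_vector(text, n=2):
    """Count character n-grams after lowercasing and removing whitespace."""
    chars = re.sub(r"\s+", "", text.lower())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def find_suspects(secret_text, candidate_text, threshold=0.6):
    """Return (paragraph index, candidate sentence, secret sentence, score)
    for every candidate sentence whose best match reaches the threshold."""
    secret_vecs = [
        (s, ngram_vector(s))
        for p in split_paragraphs(secret_text)
        for s in split_sentences(p)
    ]
    hits = []
    for p_idx, para in enumerate(split_paragraphs(candidate_text)):
        for sent in split_sentences(para):
            vec = ngram_vector(sent)
            score, match = max(
                ((cosine(vec, sv), s) for s, sv in secret_vecs),
                default=(0.0, ""),
            )
            if score >= threshold:
                hits.append((p_idx, sent, match, score))
    return hits


if __name__ == "__main__":
    secret = "The launch code name is Bluebird. The budget is three million."
    candidate = ("The weather is nice today.\n"
                 "Rumor: the launch code name is Bluebird.")
    for p_idx, sent, match, score in find_suspects(secret, candidate):
        print(f"paragraph {p_idx}: {sent!r} ~ {match!r} ({score:.2f})")
```

Reporting the paragraph index alongside each matching sentence is what makes the suspected passage locatable; comparing whole documents at once would give a single score with no position, which is the coarse-granularity problem the abstract points to.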
