
De-identification of English Electronic Medical Records Using a Two-Tier Classifier Based on SVM and CRF (cited by: 9)

De-identification on electronic medical records using a two tier classifier based on SVM and CRF
Abstract De-identification, a shared task of the 2014 i2b2/UTHealth challenge, aims to identify and remove protected health information (PHI) from electronic medical records. This paper proposes a de-identification method based on a two-tier classification model that combines support vector machines (SVMs) and conditional random fields (CRFs). A preprocessing module tokenizes the record text, and four types of features are extracted from the tokens. An SVM model is trained on these features to identify the boundaries of PHI entities, and its output is added to the feature set as a new feature. A multi-class CRF classifier is then trained on the extended feature set and used to recognize each category of PHI. Experiments show that the two-tier model is effective for PHI recognition, achieving an F-measure of 0.9110.
Source Intelligent Computer and Applications (《智能计算机与应用》), 2016, No. 6, pp. 17-19, 24 (4 pages)
Keywords electronic medical records; de-identification; SVM; CRF
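
The data flow described in the abstract (tokenize, extract features, append the tier-1 boundary decision as an extra feature for tier 2) can be illustrated with a minimal Python sketch. It shows only the feature-stacking step and trains neither an SVM nor a CRF. Everything named here is an assumption for illustration: the abstract does not list the four feature types, so `lower`, `shape`, `prev`, and `next` are hypothetical stand-ins, and `toy_boundary_tagger` is a hand-written rule standing in for the trained SVM boundary model.

```python
import re
from typing import Callable, Dict, List

Features = Dict[str, str]


def tokenize(text: str) -> List[str]:
    """Split text into alphabetic words, digit runs, and single punctuation marks."""
    return re.findall(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]", text)


def word_shape(token: str) -> str:
    """Coarse shape of a token: 'Smith' -> 'Xxxxx', '2014' -> 'dddd'."""
    shape = re.sub(r"[A-Z]", "X", token)
    shape = re.sub(r"[a-z]", "x", shape)
    return re.sub(r"\d", "d", shape)


def base_features(tokens: List[str], i: int) -> Features:
    """Four illustrative features per token (hypothetical, not the paper's set)."""
    return {
        "lower": tokens[i].lower(),
        "shape": word_shape(tokens[i]),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def toy_boundary_tagger(feats: Features) -> str:
    """Placeholder for the tier-1 SVM: 'IN' if the token might be part of PHI.

    This rule (capitalized word or number) is an assumption for the demo only.
    """
    return "IN" if feats["shape"][0] in ("X", "d") else "OUT"


def stack_features(
    tokens: List[str], boundary_tagger: Callable[[Features], str]
) -> List[Features]:
    """Append the tier-1 boundary decision to each token's feature set."""
    stacked = []
    for i in range(len(tokens)):
        feats = base_features(tokens, i)
        feats["boundary"] = boundary_tagger(feats)  # tier-1 output as a new feature
        stacked.append(feats)
    return stacked


tokens = tokenize("Mr. Smith visited on 2014-03-05.")
sequence = stack_features(tokens, toy_boundary_tagger)
for tok, feats in zip(tokens, sequence):
    print(tok, feats["shape"], feats["boundary"])
```

The resulting list of per-token feature dictionaries is the kind of input that CRF toolkits such as sklearn-crfsuite accept for a sentence, so a trained sequence labeler could take the place of the final `print` loop.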

