Abstract
Detecting negated terms in Chinese electronic medical records (EMRs) provides a basis for building a concept index over unstructured EMR text. To extract terms from EMRs, we combined the classical forward maximum matching algorithm with mutual information, which effectively avoids the influence of covering ambiguity on the extraction results. To determine negation semantics, we combined a rule-based algorithm with a word co-occurrence model, which reduces the probability of false-positive terms caused by punctuation input errors. Experiments showed that, compared with the traditional rule-based algorithm, the proposed method improved the negative predictive value by 6.85% (the English-language version of the abstract reports 7.85%).
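The two building blocks named in the abstract for term extraction can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how mutual information is combined with the matching step, so the two functions are shown separately. The names `forward_max_match` and `pmi`, the toy lexicon, and the use of a single count total as the normalizer are all assumptions made for the example.

```python
import math


def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring found in the lexicon;
    fall back to a single character when nothing matches.
    """
    if max_len is None:
        max_len = max((len(w) for w in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def pmi(x, y, unigrams, bigrams, total):
    """Pointwise mutual information log2(p(x,y) / (p(x) * p(y))).

    Assumes (for illustration) that unigram and bigram counts share the
    same normalizer `total`.
    """
    joint = bigrams.get((x, y), 0)
    cx, cy = unigrams.get(x, 0), unigrams.get(y, 0)
    if joint == 0 or cx == 0 or cy == 0:
        return float("-inf")
    return math.log2(joint * total / (cx * cy))


# Hypothetical toy lexicon of clinical terms.
lexicon = {"发热", "咳嗽"}
segments = forward_max_match("无发热咳嗽", lexicon)
```

A high mutual information score between adjacent pieces suggests they belong together as one term, which is one plausible way such a score could be used to arbitrate between competing segmentations.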
Source
《生物医学工程学杂志》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2015, Issue 1, pp. 82-85 (4 pages)
Journal of Biomedical Engineering
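Funding
National Natural Science Foundation of China (81271668)
Nantong Social Undertakings Science and Technology Innovation and Demonstration Program (HS2012045)
Natural Science Foundation of Nantong University (11Z010)
Natural Science Foundation of the Jiangsu Higher Education Institutions (14KJB310014)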
Keywords
word co-occurrence
forward maximum matching
mutual information
negated term detection