Discussion on the extraction and standardization of TCM symptoms based on the maximum probability method
Abstract: Objective: To discuss the extraction and standardization of traditional Chinese medicine (TCM) symptom information by comparing two symptom extraction programs based on the maximum probability method. Methods: All data were analyzed and processed in R 3.3.2. Diagnostics, Diagnostics of Traditional Chinese Medicine, and 1,000 annotated inpatient pneumonia medical records were used to build a symptom standardization database, a symptom description lexicon, and a keyword-adjective lexicon. Based on the maximum probability method, three programs were designed: a Chinese word segmentation program (CSP), a direct extraction program (DEP), and a combination extraction program (CEP). The three programs were used to extract and standardize symptom information from 2,311 pneumonia medical records and were compared in terms of the dimensions generated, the amount of manual processing required, and symptom extraction performance. Results: Both DEP and CEP effectively reduced the number of dimensions compared with CSP. CEP required a smaller percentage of manual processing and extracted symptoms more effectively. Conclusion: CEP based on the maximum probability method can effectively extract TCM symptom information.
Institution: Guangzhou University of Chinese Medicine
Source: China Journal of Traditional Chinese Medicine and Pharmacy (《中华中医药杂志》; indexed in CAS, CSCD, PKU Core), 2017, No. 5, pp. 2159-2162 (4 pages)
Funding: Doctoral Program Foundation of the Ministry of Education (No. 20114425110009)
Keywords: Symptom; Text mining; Text data structuring; Chinese word segmentation; Maximum probability method; Standardization
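
The abstract names the maximum probability method but does not show how it is applied. The following is a minimal sketch of the general technique for word segmentation: choose the split of a string whose words have the highest joint probability under a unigram lexicon, found by dynamic programming. It is written in Python for illustration only; the study itself worked in R 3.3.2, and the function name `max_prob_segment`, the toy lexicon, and all probabilities below are hypothetical and are not taken from the paper.

```python
import math


def max_prob_segment(text, word_probs, max_word_len=4, unknown_prob=1e-8):
    """Split text into the word sequence with the highest product of
    unigram probabilities (illustrative sketch, not the paper's code)."""
    n = len(text)
    # best[i] = (best log-probability of text[:i], start index of its last word)
    best = [(-math.inf, 0)] * (n + 1)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            word = text[start:end]
            if word in word_probs:
                p = word_probs[word]
            elif len(word) == 1:
                # Assumption: unseen single characters get a small fallback
                # probability so that every input can still be segmented.
                p = unknown_prob
            else:
                continue
            score = best[start][0] + math.log(p)
            if score > best[end][0]:
                best[end] = (score, start)
    # Walk back from the end of the text to recover the chosen words.
    words = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]


# Hypothetical toy lexicon: symptom terms and single characters with made-up probabilities.
toy_lexicon = {"咳嗽": 0.3, "咳痰": 0.3, "咳": 0.05, "嗽": 0.01, "痰": 0.05}
print(max_prob_segment("咳嗽咳痰", toy_lexicon))  # ['咳嗽', '咳痰']
```

Here "咳嗽咳痰" (cough, expectoration) is split into two symptom terms because that split has a higher joint probability than any character-by-character split. In a setting like the one the abstract describes, the lexicon would presumably be built from the symptom description lexicon and annotated records rather than hand-assigned values.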