Abstract
Extracting the key disease topics from large collections of Chinese medical record documents is valuable for medical practitioners' study and research. To make this extraction easier, this paper proposes a method that combines Chinese word segmentation with the FP-Growth algorithm. Given a large corpus of Chinese medical record documents, the method first segments each document into an item-set document made up of keywords. It then uses FP-Growth to compute the frequent itemsets of those keywords and generates a pathology dictionary. Finally, it extracts the disease topics of the text.
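The pipeline described in the abstract (segment each record into keywords, then mine frequent keyword itemsets with FP-Growth) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `segment` function is a simple forward-maximum-matching stand-in for a real Chinese word segmenter, and the `vocabulary`, `records`, and `min_support` values are made-up assumptions. How the paper turns frequent itemsets into a pathology dictionary and disease topics is not specified in the abstract, so the sketch stops at the frequent itemsets.

```python
from collections import defaultdict


def segment(text, vocabulary, max_len=4):
    """Forward maximum matching; a stand-in for a real Chinese segmenter."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocabulary:
                words.append(piece)
                i += size
                break
        else:
            i += 1  # skip characters not covered by the vocabulary
    return words


class FPNode:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs; return header table and supports."""
    support = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}
    root = FPNode(None, None)
    header = defaultdict(list)  # item -> all tree nodes carrying that item
    for items, count in transactions:
        # Keep frequent items only, ordered by descending support.
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    header, frequent = build_tree(transactions, min_support)
    results = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        results[itemset] = frequent[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        conditional = []
        for node in nodes:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        results.update(fp_growth(conditional, min_support, itemset))
    return results


# Hypothetical toy data, for illustration only.
vocabulary = {"咳嗽", "发热", "肺炎", "头痛", "高血压"}
records = ["患者咳嗽发热,诊断肺炎", "咳嗽伴发热,考虑肺炎",
           "头痛,高血压病史", "发热咳嗽三天"]
min_support = 2  # assumed threshold: keyword set must occur in at least 2 records

transactions = [(segment(r, vocabulary), 1) for r in records]
itemsets = fp_growth(transactions, min_support)
for keywords, count in sorted(itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(keywords), count)
```

In practice the segmentation step would likely use a dedicated segmenter with a medical vocabulary, and the support threshold would be tuned to the size of the corpus.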
Authors
王明令
纪怀猛
吴春琼
WANG Ming-ling; JI Huai-meng; WU Chun-qiong (Spatial Information Engineering Research Centre of Fujian Province, Engineering Research Center of Business Intelligent in BigData for Fujian Province College of Artificial Intelligence, Yango University, Fuzhou, Fujian 350015)
Source
《数字技术与应用》
2019, Issue 5, pp. 74-75 (2 pages)
Digital Technology & Application
Funding
2018 Fujian Provincial Department of Education Research Project for Young and Middle-aged Teachers (JT180725)