Abstract
With the development of computer technology, most literature across fields has been digitized. This paper uses health documents created on the Web as source data and applies text mining to extract associative feature information. The Apriori algorithm is used to mine association rules among the keywords in the constructed transactions and to generate associated keywords. Associative features are then extracted from the health data using TF-C-IDF weights together with the associated keywords. Experimental evaluation in terms of precision, recall, F-measure, and efficiency indicates high performance.
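The keyword-association step can be illustrated with a minimal Apriori sketch. This is not the authors' implementation: the toy transactions, the function names `apriori` and `association_rules`, and the thresholds 0.6 and 0.7 are all assumptions chosen for illustration, and the TF-C-IDF weighting is not reproduced because the abstract does not define it.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent keyword set."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level-1 candidates: each distinct keyword on its own.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep only the candidates that meet the support threshold.
        level = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates and prune
        # any candidate that has an infrequent k-item subset.
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules above the threshold."""
    found = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so frequent[lhs] is always present.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, confidence))
    return found


# Hypothetical keyword transactions, one set of keywords per health document.
transactions = [
    {"diabetes", "insulin", "diet"},
    {"diabetes", "insulin"},
    {"diabetes", "diet"},
    {"insulin", "diet"},
    {"diabetes", "insulin", "exercise"},
]
frequent = apriori(transactions, min_support=0.6)
found_rules = association_rules(frequent, min_confidence=0.7)
```

In this toy data only the pair {diabetes, insulin} reaches the support threshold, so rules are generated in both directions between those two keywords. A production system would more likely use an established library implementation than this hand-rolled loop.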
Authors
BAI Lingling (白玲玲)
HAN Tianpeng (韩天鹏)
BAI Lingling; HAN Tianpeng (Academic Affair Office, Fuyang Party Institute of CCP, Fuyang, Anhui 236034, China; School of Computer and Information Engineering, Fuyang Normal University, Fuyang, Anhui 236037, China)
Source
Journal of Fuyang Normal University (Natural Science) (《阜阳师范学院学报(自然科学版)》)
2019, Issue 3, pp. 43-48 (6 pages)
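Funding
Supported by:
Natural Science Research Project of Fuyang Normal University (2018FSKJ11)
Research Project of the Fuyang Municipal Party School (FYDXKT201937)
Fuyang Municipal Planning Project (FSK2018051)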