Comparative Application of Attribute Reduction Methods in Data Mining of TCM Syndromes (cited by: 4)

Application of Attribute Reduction in Data Mining of TCM Syndrome
Abstract. Objective: To compare three attribute reduction methods for reducing traditional Chinese medicine (TCM) syndrome data. Methods: Bivariate correlation analysis, principal component analysis, and rough-set-based attribute reduction were each applied to the same TCM syndrome data set of primary insomnia. A C4.5 decision tree classification model for the syndrome of liver-qi stagnation transforming into fire (肝郁化火证) in primary insomnia was then built on each reduced data set and evaluated by 5-fold cross-validation. Results: The rough set reduction model outperformed the other two reduction models on every index. Its area under the receiver operating characteristic (ROC) curve was significantly larger than that of each of the other two models (P<0.01 in the Chinese abstract; P<0.05 in the English abstract). The difference in area under the ROC curve between the correlation reduction model and the principal component reduction model was not statistically significant (P>0.05). Conclusions: Rough-set-based attribute reduction maintains high-quality classification ability while eliminating as much unnecessary knowledge from the decision table as possible, yielding a small attribute subset that still classifies well. It is a feasible reduction method for TCM syndrome data.
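Source: Journal of Traditional Chinese Medicine (《中医杂志》), CSCD, PKU Core, 2012, No. 4, pp. 321-323, 330 (4 pages)
Funding: Guangdong Province Project for Building a Strong Province in Traditional Chinese Medicine (No. 2010134)
Keywords: data mining; attribute reduction; TCM syndrome; insomnia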
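
The abstract does not say which rough-set reduction algorithm was used. As a minimal sketch only, the following assumes a greedy search driven by the rough-set dependency degree (a QuickReduct-style heuristic, swapped in here for illustration), applied to a toy decision table. The symptom names, the data, and the function names `partition`, `dependency`, and `greedy_reduct` are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, attrs, decision):
    """Share of rows in the positive region: classes with a single decision value."""
    if not rows:
        return 0.0
    consistent = 0
    for block in partition(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:
            consistent += len(block)
    return consistent / len(rows)

def greedy_reduct(rows, condition_attrs, decision):
    """Add the attribute that raises dependency most until the full-set level is reached."""
    target = dependency(rows, condition_attrs, decision)
    chosen = []
    while dependency(rows, chosen, decision) < target:
        candidates = [a for a in condition_attrs if a not in chosen]
        best = max(candidates,
                   key=lambda a: dependency(rows, chosen + [a], decision))
        chosen.append(best)
    return chosen

# Hypothetical toy decision table: binary symptoms and a binary syndrome label.
CONDITIONS = ["irritability", "bitter_taste", "red_tongue", "fatigue"]
VALUES = [
    (1, 1, 1, 0, 1), (1, 1, 1, 1, 1),
    (1, 0, 1, 0, 0), (1, 0, 1, 1, 0),
    (0, 1, 0, 0, 0), (0, 1, 0, 1, 0),
    (0, 0, 0, 0, 0), (0, 0, 0, 1, 0),
]
ROWS = [dict(zip(CONDITIONS + ["syndrome"], v)) for v in VALUES]

reduct = greedy_reduct(ROWS, CONDITIONS, "syndrome")
print(reduct, dependency(ROWS, reduct, "syndrome"))
```

In this toy table, `red_tongue` duplicates `irritability` and `fatigue` carries no information about the label, so both are dropped while the decision table stays fully consistent. A greedy search of this kind returns a subset that preserves the dependency degree but is not guaranteed to be a minimal reduct; discernibility-matrix methods are another common choice.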