
The application of decision tree techniques in research on anemia among rural children under 3 years of age (cited by: 5)
Abstract Objective: To explore the application of decision tree techniques in research on anemia among rural children. Methods: In the Enterprise Miner module of SAS 8.2, health care survey data on 3000 weaned children under 3 years of age from rural areas were split 75%/25% into a training set for initial model fitting and a validation set for model tuning. A CART decision tree model was built using the Gini impurity function and evaluated by misclassification rate, ROC curve, root average squared error (Root ASE), and diagnostic chart. The variables retained in the model, and their hierarchical positions within the tree, were used to analyze the factors influencing anemia among weaned rural children under 3 years of age and the interactions among those factors. Results: The misclassification rates of the CART model were 21.2% in the training set and 21.9% in the validation set, and the Root ASE values were 0.399 and 0.404, respectively. The ROC curve lay above the reference line, with a relatively large area under the curve. In the diagnostic chart, observations whose actual and predicted values agreed made up the largest proportion, and the agreement rate for correctly classified observations was clearly higher than that for misclassified observations. The model selected 9 important factors influencing childhood anemia and ranked them by relative importance: maternal anemia (1.00) was the most important, followed by the child's age in months (0.75), the child's age at weaning (0.53), the mother's age (0.32), the time at which eggs were introduced (0.26), project county category (0.26), the time at which fresh milk was introduced (0.16), family size (0.13), and the mother's years of education (0.12). The decision tree produced simple rules that might be used for classification and prediction in similar research. Conclusion: Decision tree techniques can screen out the important factors influencing anemia and identify cut points for those factors, offering a new approach to analyzing child health care research data. With wider use, they may prove valuable in research on rural child health care.
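The Gini-based splitting and the two numeric evaluation measures named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the SAS Enterprise Miner procedure used in the study: the function names, the single-feature split search, the 0.5 classification cutoff, and the eight toy records (age in months, anemia yes/no) are all assumptions made for illustration, and Root ASE is taken here as the square root of the mean squared difference between the predicted probability and the 0/1 outcome, which may differ in detail from the software's definition.

```python
from math import sqrt


def gini(labels):
    """Gini impurity of a list of 0/1 labels: 1 - p1^2 - p0^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def best_split(values, labels):
    """Search one numeric feature for the threshold t (rule: value <= t)
    that minimizes the size-weighted Gini impurity of the two children."""
    n = len(values)
    best_t, best_imp = None, gini(labels)
    for t in sorted(set(values))[:-1]:  # the largest value cannot split
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / n
        if imp < best_imp:
            best_t, best_imp = t, imp
    return best_t, best_imp


def misclassification_rate(y_true, y_prob, cutoff=0.5):
    """Share of observations whose predicted class differs from the truth."""
    wrong = sum((p >= cutoff) != bool(y) for y, p in zip(y_true, y_prob))
    return wrong / len(y_true)


def root_ase(y_true, y_prob):
    """Square root of the average squared error of predicted probabilities."""
    return sqrt(sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true))


# Hypothetical toy data, NOT from the study: age in months and anemia (1 = yes).
age_months = [6, 8, 10, 12, 18, 24, 30, 34]
anemia = [1, 1, 1, 0, 0, 1, 0, 0]

threshold, impurity = best_split(age_months, anemia)

# Each leaf predicts the anemia proportion observed among its members.
left = [y for x, y in zip(age_months, anemia) if x <= threshold]
right = [y for x, y in zip(age_months, anemia) if x > threshold]
p_left, p_right = sum(left) / len(left), sum(right) / len(right)
predicted = [p_left if x <= threshold else p_right for x in age_months]

error_rate = misclassification_rate(anemia, predicted)
rase = root_ase(anemia, predicted)
print(threshold, impurity, error_rate, rase)
```

A full CART tree would apply `best_split` recursively across all candidate variables and, as in the abstract, compute the measures separately on the training and validation sets; this sketch stops at a single split to keep the arithmetic easy to follow.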
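Source: Chinese Journal of Preventive Medicine (《中华预防医学杂志》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2009, No. 5, pp. 434-437 (4 pages)
Funding: Project funded by the Ministry of Health and UNICEF (YH001); National Natural Science Foundation of China (30771866)
Keywords: Decision tree; Anemia; Child; Misclassification rate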
