Feasibility Study of Genetic Algorithm Estimation of Logistic Regression Parameters (cited by: 2)

The Classification Ability of the Genetic Algorithm for Parameter Estimation of the Logistic Regression Model
Abstract Objective: To evaluate the genetic algorithm as a method for estimating the parameters of logistic regression models, in comparison with maximum likelihood estimation. Methods: Three models were constructed through data simulation. Parameters were estimated with both the genetic algorithm and the maximum likelihood method, and the classification performance and variance of the resulting models were assessed. Results: In general, maximum likelihood estimation classified slightly better than the genetic algorithm. When the sample size was small or the relationships among the independent variables were complex, the generalization error of both methods increased. For maximum likelihood estimation, the generalization error arose mainly from reduced classification performance on the validation set, whereas for the genetic algorithm it arose mainly from overfitting on the training set. When the sample size was small and the relationships among the independent variables were complex, the maximum likelihood iterations failed to converge and the parameters could not be estimated; the genetic algorithm showed no such problem. Conclusion: The genetic algorithm is suitable for estimating the parameters of logistic regression models when the number of independent variables is large and the sample size is relatively small.
Source Chinese Journal of Health Statistics (《中国卫生统计》), 2012, No. 1, pp. 74-76 (3 pages). Indexed in CSCD and the Peking University Core Journals list (北大核心).
Keywords genetic algorithm; logistic regression; maximum likelihood method; parameter estimation
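
The idea summarized in the abstract, using a genetic algorithm rather than iterative maximum likelihood to search for logistic regression coefficients, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the three simulated models, the sample sizes, or the genetic algorithm settings, so `true_beta`, `pop_size`, `generations`, `bound`, and `mut_sd` below are all hypothetical choices. The sketch assumes the Bernoulli log-likelihood as the fitness function, blend crossover, Gaussian mutation, and elitism, and it reports accuracy on a training set and a separate validation set, mirroring the abstract's comparison of the two.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(beta, x):
    """P(y = 1 | x) for coefficients beta = [intercept, slope_1, ...]."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood, used here as the GA fitness (assumption)."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(beta, xi), 1e-12), 1.0 - 1e-12)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


def simulate(n, true_beta, rng):
    """Draw n rows with standard-normal predictors and logistic outcomes."""
    X, y = [], []
    for _ in range(n):
        xi = [rng.gauss(0.0, 1.0) for _ in true_beta[1:]]
        X.append(xi)
        y.append(1 if rng.random() < predict(true_beta, xi) else 0)
    return X, y


def ga_fit(X, y, rng, pop_size=30, generations=80, bound=5.0, mut_sd=0.3):
    """Search coefficient vectors in [-bound, bound] that maximize fitness."""
    k = len(X[0]) + 1
    pop = [[rng.uniform(-bound, bound) for _ in range(k)]
           for _ in range(pop_size)]
    pop[0] = [0.0] * k  # seed with the null model as a baseline individual
    for _ in range(generations):
        ranked = sorted(pop, key=lambda b: log_likelihood(b, X, y),
                        reverse=True)
        elite = ranked[: pop_size // 4]
        nxt = [list(b) for b in elite[:2]]  # elitism: keep the two best as-is
        while len(nxt) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            w = rng.random()
            # Blend crossover followed by Gaussian mutation, clipped to bounds.
            child = [w * a + (1.0 - w) * b + rng.gauss(0.0, mut_sd)
                     for a, b in zip(p1, p2)]
            nxt.append([min(max(c, -bound), bound) for c in child])
        pop = nxt
    return max(pop, key=lambda b: log_likelihood(b, X, y))


def accuracy(beta, X, y):
    """Share of rows classified correctly at a 0.5 probability cutoff."""
    hits = sum((predict(beta, xi) >= 0.5) == (yi == 1)
               for xi, yi in zip(X, y))
    return hits / len(y)


rng = random.Random(0)
true_beta = [-0.5, 1.0, -1.5]  # hypothetical intercept and two slopes
X_train, y_train = simulate(100, true_beta, rng)
X_valid, y_valid = simulate(100, true_beta, rng)
beta_hat = ga_fit(X_train, y_train, rng)
print("GA estimate:", [round(b, 3) for b in beta_hat])
print("training accuracy:", accuracy(beta_hat, X_train, y_train))
print("validation accuracy:", accuracy(beta_hat, X_valid, y_valid))
```

Because the search only evaluates the likelihood and never inverts an information matrix, it returns a bounded estimate even where Newton-type maximum likelihood iterations would fail to converge. The gap between training and validation accuracy is a rough indicator of the overfitting the abstract attributes to the genetic algorithm.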