Abstract
Objective To compare the performance of seven common robust estimation methods with ordinary least squares (OLS) in the varying coefficient model, and to provide a basis for choosing an estimation method for this model. Methods Data were generated from a varying coefficient model by random simulation in R software and then contaminated. Bias, variance, mean square error (MSE) and integrated mean square error (IMSE) were used as evaluation indices to compare the robust methods with OLS. Results When the data were contaminated, especially when outliers occurred in the x-direction, M-Huber, least absolute deviation (LAD), MM and R estimation gave smaller bias, variance, MSE and IMSE than OLS in almost all cases. Among them, MM performed best overall across the outlier conditions examined. Least trimmed squares (LTS), least median of squares (LMS) and S estimation were unstable in R software and therefore do not appear suitable for the varying coefficient model. Conclusion When outliers are present, MM estimation yields more accurate results in the varying coefficient model.
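The four evaluation indices named in the abstract can be sketched as follows. The abstract gives no formulas, so this sketch assumes the standard pointwise definitions for a coefficient curve estimated over repeated simulations, with IMSE taken as the integral of the pointwise MSE over the grid. The function name `evaluation_indices`, the grid, the true curve, and the synthetic "estimates" are all hypothetical and are not taken from the paper.

```python
import numpy as np


def evaluation_indices(estimates, truth, grid):
    """Pointwise bias, variance, MSE and the IMSE of an estimated curve.

    estimates: array (n_rep, n_grid), one fitted coefficient curve per replication
    truth:     array (n_grid,), the true coefficient curve on the grid
    grid:      array (n_grid,), increasing grid points of the index variable
    """
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)
    grid = np.asarray(grid, dtype=float)

    bias = estimates.mean(axis=0) - truth          # mean estimate minus truth
    variance = estimates.var(axis=0)               # variance across replications (ddof=0)
    mse = ((estimates - truth) ** 2).mean(axis=0)  # equals bias**2 + variance

    # Assumed IMSE: trapezoidal integral of the pointwise MSE over the grid.
    imse = float(np.sum((mse[1:] + mse[:-1]) / 2.0 * np.diff(grid)))
    return bias, variance, mse, imse


# Hypothetical usage: fake "estimates" that are the true curve plus a constant
# offset of 0.1 and random noise, standing in for real fitted curves.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 51)
truth = np.sin(2 * np.pi * grid)
estimates = truth + 0.1 + rng.normal(scale=0.2, size=(200, grid.size))
bias, variance, mse, imse = evaluation_indices(estimates, truth, grid)
```

With the population variance (`ddof=0`), the pointwise MSE decomposes exactly into squared bias plus variance. Under these assumed definitions, an estimator with smaller values on all four indices, as the abstract reports for MM under contamination, has fitted curves that are both closer to the truth on average and more stable across replications.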
Source
《中国卫生统计》
CSCD
PKU Core (Peking University Chinese Core Journals)
2016, Issue 4, pp. 554-558 (5 pages)
Chinese Journal of Health Statistics
Funding
National Natural Science Foundation of China (11371100)
Keywords
Varying coefficient model
Robustness
Outlier