Abstract: The traditional Pearson correlation coefficient is not robust: outliers can make the computed value inconsistent with the actual relationship between the variables. To address this problem, this paper presents a robust estimation method. In simulations with sample sizes of 20, 50, 100 and 200 and contamination rates of 1%, 5% and 10%, the traditional and robust correlation coefficients are compared, and the accuracy of the robust correlation coefficient is found to be significantly higher than that of the traditional coefficient in every case. An example analysis further verifies the feasibility and effectiveness of the robust correlation coefficient. The conclusions can be used for robust estimation of correlation coefficients for variables that contain outliers.
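The simulation design described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's robust estimator, so the code below swaps in a simple stand-in: Pearson correlation computed after dropping points that lie more than 3 scaled MADs from the median in either variable. The true correlation (0.8), the outlier location (around (8, -8)), the rule of at least one outlier per sample, the number of replications, and all function names are assumptions made for illustration only, not details taken from the paper.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Classical Pearson correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mad_keep(vals, cutoff=3.0):
    """Flag values within `cutoff` scaled MADs of the median."""
    med = statistics.median(vals)
    # 1.4826 makes the MAD comparable to a standard deviation under normality.
    mad = 1.4826 * statistics.median(abs(v - med) for v in vals)
    return [abs(v - med) <= cutoff * mad for v in vals]


def filtered_pearson(xs, ys, cutoff=3.0):
    """Stand-in robust estimator (assumption, not the paper's formula)."""
    keep = [a and b for a, b in zip(mad_keep(xs, cutoff), mad_keep(ys, cutoff))]
    return pearson([x for x, k in zip(xs, keep) if k],
                   [y for y, k in zip(ys, keep) if k])


def contaminated_sample(n, pct, rho, rng):
    """Bivariate normal sample with pct% of points replaced by outliers."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    k = max(1, n * pct // 100)  # assumption: at least one outlier per sample
    for i in range(k):
        xs[i] = 8.0 + rng.gauss(0, 1)   # assumed outlier location
        ys[i] = -8.0 + rng.gauss(0, 1)
    return xs, ys


def simulate(reps=200, rho=0.8, seed=0):
    """Mean absolute error of both estimators for each (n, pct) cell."""
    rng = random.Random(seed)
    results = {}
    for n in (20, 50, 100, 200):
        for pct in (1, 5, 10):
            err_p = err_r = 0.0
            for _ in range(reps):
                xs, ys = contaminated_sample(n, pct, rho, rng)
                err_p += abs(pearson(xs, ys) - rho)
                err_r += abs(filtered_pearson(xs, ys) - rho)
            results[(n, pct)] = (err_p / reps, err_r / reps)
    return results


if __name__ == "__main__":
    for (n, pct), (mae_p, mae_r) in simulate().items():
        print(f"n={n:3d} contamination={pct:2d}%  "
              f"Pearson MAE={mae_p:.3f}  filtered MAE={mae_r:.3f}")
```

Running the script prints, for each sample size and contamination rate, the mean absolute error of each estimator relative to the assumed true correlation. Mean absolute error is used here as a simple proxy for the "accuracy" mentioned in the abstract; the paper's own accuracy criterion may differ.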