Abstract
To better fit skewed data and extract the information it contains, this paper establishes a mode regression model for skew-normal data and studies statistical diagnostics for the model based on the Pena distance statistic. An expression for the Pena distance under the mode regression model and a method for diagnosing high-leverage outliers are obtained. Maximum likelihood estimates of the model parameters are computed with the EM algorithm combined with the gradient descent method. Using the case-deletion model, the likelihood distance, Cook distance, and Pena distance statistics are calculated and diagnostic plots are drawn. Monte Carlo simulations and two real-data examples indicate that the proposed method is effective and that, under certain conditions, the Pena distance outperforms the likelihood distance and the Cook distance in diagnosing outliers or strongly influential points.
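The abstract compares case-deletion diagnostics. As a rough illustration only, the sketch below computes Cook's distance and Pena's statistic in closed form for an ordinary least-squares simple linear regression. It does not implement the skew-normal mode regression model or the EM and gradient descent estimation studied in the paper, whose expressions are not given in this record. The function name `influence_measures` and the toy data are assumptions introduced here.

```python
def influence_measures(x, y):
    """Cook's distance and Pena's statistic for the OLS fit y = a + b*x.

    Illustrative assumption: ordinary least squares with one covariate,
    not the skew-normal mode regression model of the paper.
    """
    n, p = len(x), 2                      # p = number of regression coefficients
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    e = [yi - (a + b * xi) for xi, yi in zip(x, y)]   # ordinary residuals
    s2 = sum(ei ** 2 for ei in e) / (n - p)           # residual variance estimate

    def hat(i, j):
        # Entry h_ij of the hat matrix for simple linear regression
        return 1.0 / n + (x[i] - xbar) * (x[j] - xbar) / sxx

    h = [hat(i, i) for i in range(n)]                 # leverages h_ii

    # Cook's distance: D_i = e_i^2 h_ii / (p s^2 (1 - h_ii)^2)
    cook = [e[i] ** 2 * h[i] / (p * s2 * (1 - h[i]) ** 2) for i in range(n)]

    # Deleting case j changes fitted value i by h_ij e_j / (1 - h_jj).
    # Pena's statistic: S_i = sum_j (h_ij e_j / (1 - h_jj))^2 / (p s^2 h_ii)
    pena = [
        sum((hat(i, j) * e[j] / (1 - h[j])) ** 2 for j in range(n))
        / (p * s2 * h[i])
        for i in range(n)
    ]
    return cook, pena


# Hypothetical toy data: the last point has high leverage and departs from the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 8.0]
cook, pena = influence_measures(x, y)
for i, (d, s) in enumerate(zip(cook, pena)):
    print(f"case {i}: Cook = {d:.3f}, Pena = {s:.3f}")
```

Cook's distance measures how much deleting case i moves the whole vector of fitted values. Pena's statistic instead measures how much the fitted value at case i moves as each of the other cases is deleted in turn.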
Authors
CAO Xing-yun (曹幸运)
ZENG Xin (曾鑫)
WU Liu-cang (吴刘仓)
Faculty of Science, Kunming University of Science and Technology, Kunming 650093, China
Source
《高校应用数学学报(A辑)》 (Applied Mathematics A Journal of Chinese Universities (Ser. A))
Peking University Core Journal (北大核心)
2021, No. 1, pp. 9-20 (12 pages)
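Funding
National Natural Science Foundation of China (11861041, 11261025)
Student Academic Science and Technology Innovation Fund of Kunming University of Science and Technology (2020YB208)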
Keywords
skew-normal distribution
mode regression model
Pena distance
EM algorithm
gradient descent method