Abstract
Outliers are common in real data and can distort both the parameter estimates and the variable selection results of the Bayesian LASSO. To improve its robustness, a prior that allows for heteroscedastic disturbances is introduced. The posterior distribution of each parameter is derived, and Gibbs sampling is used to obtain point estimates and confidence intervals. In numerical simulations the method shows low fitting error and high accuracy in identifying relevant variables. Analyses of the diabetes dataset and the Plasma Beta-Carotene Level dataset show that the method balances model simplicity against prediction error, delivers robust variable selection and coefficient estimation, and effectively suppresses the influence of outliers and heteroscedastic disturbances that may be present in the data.
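The abstract does not state the exact prior, so the following is only a minimal sketch of how a Gibbs sampler of this kind could look, not the authors' algorithm. It assumes a Park-Casella style Bayesian LASSO in which each observation has its own variance factor `v_i` with an inverse-gamma prior (which makes the errors marginally Student-t), a fixed penalty `lam`, a centered response, and no intercept. The function name `robust_bayesian_lasso_gibbs` and the parameters `lam` and `nu` are hypothetical.

```python
import numpy as np


def robust_bayesian_lasso_gibbs(X, y, lam=1.0, nu=4.0, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler for an assumed heteroscedastic Bayesian LASSO.

    Assumed model (illustrative, not taken from the paper):
        y_i | beta, sigma2, v_i ~ N(x_i' beta, sigma2 * v_i)
        beta_j | sigma2, tau2_j ~ N(0, sigma2 * tau2_j)
        tau2_j ~ Exponential(rate = lam^2 / 2)
        v_i ~ Inverse-Gamma(nu / 2, nu / 2)
        p(sigma2) proportional to 1 / sigma2
    Returns the post-burn-in draws of beta, shape (n_iter - burn_in, p).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    sigma2 = float(np.var(y))
    inv_tau2 = np.ones(p)   # 1 / tau2_j
    v = np.ones(n)          # per-observation variance factors
    draws = np.empty((n_iter - burn_in, p))

    for it in range(n_iter):
        # beta | rest ~ N(A^{-1} X' V^{-1} y, sigma2 * A^{-1})
        Xw = X / v[:, None]                      # V^{-1} X
        A = X.T @ Xw + np.diag(inv_tau2)
        L = np.linalg.cholesky(A)                # A = L L'
        mean = np.linalg.solve(A, Xw.T @ y)
        z = rng.standard_normal(p)
        beta = mean + np.sqrt(sigma2) * np.linalg.solve(L.T, z)

        # sigma2 | rest ~ Inverse-Gamma((n + p) / 2, scale)
        resid = y - X @ beta
        scale = 0.5 * (np.sum(resid**2 / v) + np.sum(beta**2 * inv_tau2))
        sigma2 = scale / rng.gamma(0.5 * (n + p))

        # 1 / tau2_j | rest ~ Inverse-Gaussian(mean = sqrt(lam^2 sigma2 / beta_j^2), shape = lam^2)
        mu = np.sqrt(lam**2 * sigma2 / np.maximum(beta**2, 1e-12))
        inv_tau2 = rng.wald(mu, lam**2)

        # v_i | rest ~ Inverse-Gamma((nu + 1) / 2, (nu + resid_i^2 / sigma2) / 2)
        v = 0.5 * (nu + resid**2 / sigma2) / rng.gamma(0.5 * (nu + 1.0), size=n)

        if it >= burn_in:
            draws[it - burn_in] = beta
    return draws
```

Under these assumptions, the posterior mean `draws.mean(axis=0)` serves as the coefficient estimate, and `np.percentile(draws, [2.5, 97.5], axis=0)` gives equal-tailed interval estimates. Observations with large residuals receive large `v_i`, which shrinks their weight in the update for `beta`; this is the mechanism by which a heteroscedastic prior of this type can dampen outliers.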
Authors
梁韵婷
张辉国
胡锡健
LIANG Yunting; ZHANG Huiguo; HU Xijian (College of Mathematics and System Science, Xinjiang University, Urumqi 830046, China)
Source
《西南师范大学学报(自然科学版)》
CAS
2023, No. 8, pp. 33-40 (8 pages)
Journal of Southwest China Normal University(Natural Science Edition)
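Funding
National Natural Science Foundation of China (11961065)
Humanities and Social Sciences Research Planning Fund of the Ministry of Education (19YJA910007)
Xinjiang Natural Science Foundation (2019D01C045)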