
Estimation of Poisson regression coefficients for big data based on optimal subsampling
Abstract: To compute Poisson regression estimates quickly and accurately, this study proposes a Poisson regression method built on an optimal subsampling algorithm. After establishing the asymptotic properties of the subsample estimator, it derives a two-step optimal subsampling algorithm and designs sampling schemes under two sets of sampling probabilities, one motivated by A-optimality and one by the L-optimality criterion. In performance comparisons, the mean squared error of the proposed optimal subsampling algorithm is significantly lower than that of the other methods. In running-time comparisons, the scheme using the L-optimality sampling probabilities estimates the regression coefficients in less time than the scheme using the A-optimality probabilities. With very large sample sizes and dimensions, the optimal subsampling algorithm reduces the average running time relative to the leverage subsampling algorithm by 61.84% and 70.64% in the two dimension settings, respectively. These results indicate that Poisson regression based on the proposed optimal subsampling algorithm effectively approximates the full-data maximum likelihood estimate and has an advantage in estimating regression coefficients.
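The two-step procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state the sampling probabilities, so the forms below are assumptions borrowed from the general optimal subsampling literature for generalized linear models: probabilities proportional to |y_i - exp(x_i'β)|·‖x_i‖ for the L-optimality variant and to |y_i - exp(x_i'β)|·‖M⁻¹x_i‖ for the A-optimality variant, with β replaced by a pilot estimate. The function names `weighted_poisson_mle` and `two_step_subsample`, the simulated data, and the subsample sizes are all hypothetical and are not taken from the paper.

```python
import numpy as np

def weighted_poisson_mle(X, y, w, iters=50, tol=1e-8):
    """Newton's method for a weighted Poisson log-likelihood (log link)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = np.exp(X @ beta)
        grad = X.T @ (w * (y - mu))              # weighted score
        hess = (X * (w * mu)[:, None]).T @ X     # weighted information matrix
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.linalg.norm(step) < tol:
            break
    return beta

def two_step_subsample(X, y, r0, r, criterion="L", rng=None):
    """Two-step subsampling sketch; the probability formulas are assumed."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    # Step 1: uniform pilot subsample and pilot estimate.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta0 = weighted_poisson_mle(X[idx0], y[idx0], np.ones(r0))
    # Step 2: approximate optimal probabilities from the pilot estimate.
    mu = np.exp(X @ beta0)
    resid = np.abs(y - mu)
    if criterion == "A":
        # Assumed A-optimality form: needs the pilot information matrix M.
        M = (X[idx0] * mu[idx0][:, None]).T @ X[idx0] / r0
        norms = np.linalg.norm(np.linalg.solve(M, X.T).T, axis=1)
    else:
        # Assumed L-optimality form: only the covariate norms are needed.
        norms = np.linalg.norm(X, axis=1)
    p = resid * norms
    p = p / p.sum()
    idx1 = rng.choice(n, size=r, replace=True, p=p)
    # Combine both subsamples with inverse-probability weights.
    idx = np.concatenate([idx0, idx1])
    w = np.concatenate([np.full(r0, float(n)), 1.0 / p[idx1]])
    return weighted_poisson_mle(X[idx], y[idx], w)

# Simulated data (hypothetical settings, for illustration only).
rng = np.random.default_rng(0)
n, d = 20000, 3
X = rng.uniform(-1.0, 1.0, size=(n, d))
beta_true = np.full(d, 0.5)
y = rng.poisson(np.exp(X @ beta_true))

beta_full = weighted_poisson_mle(X, y, np.ones(n))   # full-data MLE
beta_L = two_step_subsample(X, y, r0=500, r=2000, criterion="L", rng=1)
beta_A = two_step_subsample(X, y, r0=500, r=2000, criterion="A", rng=1)
```

Under these assumed forms, the L-optimality variant skips the extra linear solve against M for every observation, which is one plausible reason it would run faster than the A-optimality variant, as the abstract reports.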
Author: WEN Xuejun (温雪俊), Shanxi University of Electronic Science and Technology, Linfen 041000, China
Source: Journal of Shandong University of Technology: Natural Science Edition (《山东理工大学学报(自然科学版)》), CAS, 2024, No. 6, pp. 59-64 (6 pages)
Keywords: optimal subsampling; asymptotic properties; Poisson regression; running time; mean square error