Distributed Computing Method of Tail-expectation Regression Based on Large-scale Data (基于大规模数据尾期望回归的分布式计算方法)
Cited by: 1
Abstract Large-scale data are massive, fast-growing, and diverse information assets that require new processing modes in order to deliver stronger insight and decision-making power. Analyzing massive data is highly complex and faces two main challenges: the data are difficult to store, and they are skewed. This paper therefore addresses two problems. (1) The data are stored in a distributed manner to reduce the storage burden on any single machine, and tail-expectation regression is used to analyze the skewed data. (2) Based on tail-expectation regression, a communication-efficient gradient-enhanced loss function is constructed for the global loss function, and a modified ADMM algorithm is proposed to solve the resulting optimization problem. Simulation studies show that, within a finite number of interactions between the master and slave machines, the estimation error of the proposed distributed computing method decreases and approaches the estimation error of the global optimal method. An empirical study based on National Health Interview Survey (NHIS) data shows that the proposed distributed computing method performs well in predicting the body weight of the national population.
Authors Pan Yingli (潘莹丽); Liu Fei (刘飞); Liu Zhan (刘展); Zhao Xiaoluo (赵晓洛) (School of Mathematics and Statistics, Hubei University, Wuhan 430062, China; School of Law, Huazhong University of Science and Technology, Wuhan 430074, China; School of Sports Science and Technology, Wuhan Sports University, Wuhan 430079, China)
Source Statistics & Decision (《统计与决策》), CSSCI, Peking University Core Journal, 2022, Issue 12, pp. 11-16 (6 pages)
Funding National Natural Science Foundation of China (Grant No. 11901175).
Keywords large-scale data; tail-expectation regression; distributed computing; modified ADMM algorithm; NHIS
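
The abstract describes a gradient-enhanced loss that lets the master machine approximate the global fit using only gradients sent by the other machines. The sketch below illustrates that idea under several assumptions that are not taken from the paper: tail-expectation regression is assumed to use the standard expectile (asymmetric squared) loss, the surrogate is assumed to be the local loss minus a linear gradient-correction term, and plain gradient descent stands in for the paper's modified ADMM solver. The function names (`expectile_loss`, `loss_and_grad`, `distributed_expectile`) and all parameters are hypothetical.

```python
import numpy as np


def expectile_loss(residual, tau):
    """Asymmetric squared loss: weight tau on residuals >= 0, 1 - tau otherwise."""
    weight = np.where(residual >= 0, tau, 1.0 - tau)
    return weight * residual ** 2


def loss_and_grad(X, y, beta, tau):
    """Mean expectile loss of a linear model and its gradient with respect to beta."""
    r = y - X @ beta
    w = np.where(r >= 0, tau, 1.0 - tau)
    loss = np.mean(w * r ** 2)
    grad = -2.0 * X.T @ (w * r) / len(y)
    return loss, grad


def distributed_expectile(X_parts, y_parts, tau, rounds=5, inner_steps=200, lr=0.1):
    """Illustrative estimator; machine 0 plays the master (an assumption)."""
    X1, y1 = X_parts[0], y_parts[0]
    sizes = np.array([len(y) for y in y_parts], dtype=float)

    # Initial estimate from the master's local data only.
    beta = np.zeros(X1.shape[1])
    for _ in range(inner_steps):
        beta = beta - lr * loss_and_grad(X1, y1, beta, tau)[1]

    for _ in range(rounds):
        # One interaction: every machine reports its local gradient at beta.
        grads = [loss_and_grad(X, y, beta, tau)[1] for X, y in zip(X_parts, y_parts)]
        global_grad = sum(n * g for n, g in zip(sizes, grads)) / sizes.sum()
        shift = grads[0] - global_grad  # local minus global gradient at beta

        # Minimize the surrogate: local loss minus <shift, b>.
        # Plain gradient descent is used here, NOT the paper's modified ADMM.
        b = beta.copy()
        for _ in range(inner_steps):
            b = b - lr * (loss_and_grad(X1, y1, b, tau)[1] - shift)
        beta = b
    return beta
```

With `tau = 0.5` the loss is symmetric, so the estimate can be compared against ordinary least squares on the pooled data as a sanity check; values of `tau` near 0 or 1 target the tails of a skewed response.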