
Adaptive Online Kernel Density Estimation Method (cited by: 1)
Abstract: Given a set of observations, estimating the underlying probability density function is a fundamental task in statistics, known as the density estimation problem. With advances in data collection technology, large volumes of real-time streaming data have emerged. Such data are high in volume and generated at high speed, and their underlying distribution may change over time, so estimating the distribution of such data has become a pressing problem. Among traditional density estimation algorithms, however, parametric methods have limited expressive power because of their strong model assumptions, while non-parametric methods are more expressive but usually incur high time and space complexity. Neither family therefore scales well to streaming data. By analyzing the learning process of competitive learning, this paper proposes an online density estimation algorithm for streaming data and analyzes its close connection with the Gaussian mixture model. Finally, the proposed algorithm is compared experimentally with existing density estimation algorithms. The results show that it produces better estimates than existing online density estimation algorithms and achieves performance broadly comparable to state-of-the-art offline density estimation algorithms.
Authors: DENG Qi-Lin (邓齐林); QIU Tian-Yu (邱天宇); SHEN Fu-Rao (申富饶); ZHAO Jin-Xi (赵金熙). Affiliations: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, China; Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2020, No. 4, pp. 1173-1188 (16 pages)
Funding: National Natural Science Foundation of China (61876076); Natural Science Foundation of Jiangsu Province (BK20171344).
Keywords: density estimation; Gaussian mixture model; data stream; online learning; competitive learning
