Abstract
To address the online processing of real-time data, an online regression algorithm based on Boosting is proposed. By defining a confidence interval for the fitness level of each learning machine, the algorithm establishes a real-time test for concept drift. It then uses the most recently arrived data chunk to update the individual learning machines of the ensemble one by one, iteratively, which achieves the effect of online learning. Simulation models built on data from a standard repository show that the online regression algorithm can reach accuracy similar to that of offline Boosting regression while occupying fewer memory units, learning faster, and adjusting the learning-machine parameters in a timely manner. The algorithm can also be applied in industrial production for real-time monitoring of production data.
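The abstract gives only the outline of the method, so the following is a minimal sketch of that outline rather than the authors' implementation. It assumes a linear base learner, squared error as the fitness measure, and a z-sigma upper bound on the mean chunk error as the confidence interval; the names `fit_line`, `OnlineBoostedRegressor`, `make_chunk`, and the values `n_stages=3` and `z=3.0` are all hypothetical. Each stage fits the residual left by the earlier stages, and a stage is refit on the newest chunk only when its error on that chunk falls outside its interval.

```python
import random
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (hypothetical base learner)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx


class OnlineBoostedRegressor:
    """Additive ensemble; each stage fits the residual left by earlier stages."""

    def __init__(self, n_stages=3, z=3.0):
        self.n_stages = n_stages
        self.z = z        # interval width in standard errors (assumed value)
        self.stages = []  # per stage: (a, b, upper bound on mean chunk error)

    def predict(self, x):
        return sum(a * x + b for a, b, _ in self.stages)

    def _train_stage(self, xs, residual):
        a, b = fit_line(xs, residual)
        sq = [(r - (a * x + b)) ** 2 for x, r in zip(xs, residual)]
        # Upper end of a z-sigma confidence interval for the mean squared error.
        upper = mean(sq) + self.z * pstdev(sq) / len(sq) ** 0.5
        return a, b, upper

    def update(self, xs, ys):
        """Consume one data chunk; return indices of stages that were refit."""
        residual, refit = list(ys), []
        for k in range(self.n_stages):
            if k == len(self.stages):  # first chunk: create the stage
                self.stages.append(self._train_stage(xs, residual))
            else:
                a, b, upper = self.stages[k]
                err = mean((r - (a * x + b)) ** 2 for x, r in zip(xs, residual))
                if err > upper:  # outside the interval: treat as concept drift
                    self.stages[k] = self._train_stage(xs, residual)
                    refit.append(k)
            a, b, _ = self.stages[k]
            residual = [r - (a * x + b) for x, r in zip(xs, residual)]
        return refit


def make_chunk(rng, slope, n=50, noise=0.1):
    """Synthetic chunk from y = slope*x + 1 plus Gaussian noise (illustration only)."""
    xs = [rng.random() for _ in range(n)]
    ys = [slope * x + 1.0 + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


rng = random.Random(0)
model = OnlineBoostedRegressor()
model.update(*make_chunk(rng, slope=2.0))           # initial concept
refit = model.update(*make_chunk(rng, slope=-3.0))  # the concept changes
print(refit, round(model.predict(1.0), 2))
```

Only the current chunk and a few numbers per stage are kept in memory, which reflects the storage saving the abstract claims. With a linear base learner the later stages have little left to fit, so a weaker learner would be the more natural choice in a real Boosting setup.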
Source
Computer Measurement & Control (《计算机测量与控制》)
CSCD
2008, No. 6, pp. 840-842 (3 pages)
Funding
Guangdong Province Key Science and Technology Project (2005B10201005).
Keywords
online regression
ensemble algorithm
concept drift