Abstract
Big data analysis and its commercial applications have become a research hotspot. Drawing on the idea of ensemble learning in machine learning, this paper approaches the problem from the angle of model blending and studies ways to improve the precision and robustness of blended models. It designs a two-level model blending algorithm based on logistic regression, abbreviated as the TMBLR algorithm, and applies it to the analysis of user renewal behavior for a commercial software product. The experimental results show that the blended model is more robust and predicts more accurately: its F1-score is 2.05% higher than that of a single base classifier. Compared with the commonly used voting method, the TMBLR algorithm has an average F1-score that is 1.1% higher and a mean square deviation of the F1-score that is 7.2‰ lower, which indicates better stability. In the second-level training of the blending algorithm, using logistic regression yields comparatively high precision, F1-score, and time efficiency.
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals (北大核心)
2017, No. 10, pp. 2231-2235 (5 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (F020107)
Keywords
big data
machine learning
ensemble learning
model blending
logistic regression