Abstract
To address the low efficiency of manually selecting base learners in the traditional Stacking algorithm, and its inability to identify the optimal base learners, a Stacking algorithm based on dynamic clustering is proposed and applied to sales forecasting. First, the silhouette coefficient method is used to compute silhouette coefficients for the outputs of multiple initial base learners under different numbers of clusters. Then, the number of clusters with the largest silhouette coefficient is dynamically selected for k-means clustering, and after each round of clustering a reward value is assigned according to the error between each cluster center and the label values. Finally, the base learners contained in the cluster with the largest reward value are selected as the optimal base learners. Experimental results show that, compared with the Stacking algorithm based on feature fusion, the proposed algorithm reduces the Root Mean Square Percentage Error (RMSPE) by 1.3 percentage points and the Mean Absolute Percentage Error (MAPE) by 1.0 percentage points; compared with the Stacking algorithm based on the analytic hierarchy process, it reduces the RMSPE by 1.1 percentage points and the MAPE by 0.8 percentage points.
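The selection procedure summarized in the abstract can be illustrated with a minimal single-round sketch. The abstract does not give the reward formula, the number of rounds, or the k-means settings, so everything specific here is an assumption: the reward is taken to be the negative RMSPE between a cluster center and the labels, k-means uses a deterministic farthest-point initialization, and the learner names and predictions are made up for illustration. The helper names (`kmeans`, `silhouette`, `select_base_learners`) are hypothetical and are not the authors' code.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rmspe(y_true, y_pred):
    """Root mean square percentage error (labels must be non-zero)."""
    return math.sqrt(sum(((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def kmeans(points, k, iters=50):
    """Plain k-means; farthest-point initialization is an assumption made for reproducibility."""
    centers = [list(points[0])]
    while len(centers) < k:
        centers.append(list(max(points, key=lambda p: min(dist(p, c) for c in centers))))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append([sum(col) / len(members) for col in zip(*members)] if members else centers[j])
        if updated == centers:  # assignments are stable
            break
        centers = updated
    return labels, centers

def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    if len(set(labels)) < 2:
        return -1.0
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        groups = ([q for q, lab in zip(points, labels) if lab == c]
                  for c in set(labels) if c != labels[i])
        b = min(sum(dist(p, q) for q in g) / len(g) for g in groups)  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def select_base_learners(predictions, y_true):
    """Pick k by silhouette, cluster learner outputs, keep the cluster with the highest reward."""
    names = list(predictions)
    points = [predictions[n] for n in names]
    best_k = max(range(2, len(points)), key=lambda k: silhouette(points, kmeans(points, k)[0]))
    labels, centers = kmeans(points, best_k)
    # Assumed reward: negative RMSPE between each cluster center and the labels.
    rewards = [-rmspe(y_true, c) for c in centers]
    best_cluster = max(range(best_k), key=lambda j: rewards[j])
    return [n for n, lab in zip(names, labels) if lab == best_cluster], best_k

# Hypothetical validation-set predictions from six candidate base learners.
y_true = [100, 200, 150, 300]
preds = {
    "gbdt":  [102, 198, 151, 297],
    "xgb":   [99, 203, 149, 302],
    "rf":    [101, 199, 152, 299],
    "ridge": [130, 170, 180, 260],
    "knn":   [128, 173, 178, 262],
    "svr":   [60, 260, 100, 380],
}
selected, best_k = select_base_learners(preds, y_true)
print(best_k, selected)
```

Each base learner is represented by its vector of predictions, so learners that behave alike fall into the same cluster. A multi-round variant, as the abstract suggests, would repeat the clustering and accumulate the rewards before choosing; in practice a library implementation such as scikit-learn's `KMeans` and `silhouette_score` would normally replace the hand-written helpers.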
Authors
张晏
鲍胜利
王啸飞
ZHANG Yan; BAO Shengli; WANG Xiaofei (Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, Sichuan 610041, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source
《计算机应用》
CSCD
Peking University Core Journals (北大核心)
2022, Issue S02, pp. 100-104 (5 pages)
Journal of Computer Applications
Funding
Sichuan Science and Technology Program (2020YFQ0056)
Western Young Scholars Program of the Chinese Academy of Sciences (RRJZ2021003).