Abstract
The broad learning system (BLS) offers a simple model structure, high training efficiency, and good interpretability, but it suffers from insufficient feature learning capability and unstable generalization performance. To alleviate these problems, a broad learning system based on a self-attention mechanism and a tracking differentiator (TD), denoted A-TD-BLS, is proposed. In terms of model structure, A-TD-BLS introduces a self-attention mechanism into the original BLS and further fuses and transforms the extracted features through attention weighting, thereby improving the feature learning ability of the original BLS. In terms of the training algorithm, a TD-based weight optimization algorithm is proposed: by limiting the magnitude of the weight values, it effectively alleviates the overfitting of the original BLS, significantly reduces the influence of the number of hidden-layer nodes on model performance, and makes the generalization performance more stable. This training algorithm is further extended to the incremental learning framework of BLS, so that the improved model can raise its performance by dynamically adding hidden-layer nodes. Experiments on benchmark datasets show that, compared with the original BLS, A-TD-BLS improves classification accuracy by 1.27% on average on the classification datasets, reduces the root mean square error by 0.53 on average on the regression datasets, and its generalization performance is less affected by the number of hidden-layer nodes. In summary, A-TD-BLS improves the stability of the generalization performance of the original BLS, reduces the sensitivity of model performance to hyperparameters, and effectively suppresses overfitting.
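To make the pipeline described in the abstract more concrete, the sketch below shows a minimal BLS-style model with attention-weighted feature fusion. It is only an illustrative assumption built from the high-level description above: the function names (feature_nodes, self_attention, enhancement_nodes, fit_output_weights), the group sizes, and the random untrained attention projections are hypothetical, and a plain ridge solution stands in for the paper's TD-based weight optimization algorithm, whose details are not given here.

```python
# Hypothetical sketch of a BLS-style pipeline with self-attention feature fusion.
# Not the paper's A-TD-BLS implementation; all names and sizes are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def feature_nodes(X, n_groups=6, d=8):
    """Random linear feature-mapping nodes, one block per group (N x n_groups x d)."""
    blocks = [X @ rng.standard_normal((X.shape[1], d)) + rng.standard_normal(d)
              for _ in range(n_groups)]
    return np.stack(blocks, axis=1)

def self_attention(T):
    """Scaled dot-product self-attention over the node groups of each sample.
    Projection matrices are random placeholders (untrained)."""
    d = T.shape[-1]
    Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
    Q, K, V = T @ Wq, T @ Wk, T @ Wv
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d)
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    return (attn @ V).reshape(T.shape[0], -1)   # fused features, N x (n_groups * d)

def enhancement_nodes(Z, n_enh=60):
    """Nonlinear enhancement nodes computed from the fused features."""
    W = rng.standard_normal((Z.shape[1], n_enh))
    return np.tanh(Z @ W + rng.standard_normal(n_enh))

def fit_output_weights(A, Y, lam=1e-2):
    """Ridge solution for the output layer; stands in for the TD-based optimizer."""
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)

# Tiny usage example on random data with one-hot binary targets.
X = rng.standard_normal((200, 20))
Y = np.eye(2)[rng.integers(0, 2, size=200)]
Z = self_attention(feature_nodes(X))
A = np.hstack([Z, enhancement_nodes(Z)])
W_out = fit_output_weights(A, Y)
pred = (A @ W_out).argmax(axis=1)
```

The ridge penalty here plays a role loosely analogous to what the abstract attributes to the TD-based optimizer, keeping output-weight magnitudes small to curb overfitting, but it is not the paper's method.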
Authors
LIAO Lüchao; ZOU Weidong; YANG Jialong; LU Huihuang; XIA Yuanqing; GAO Jianlei (Fujian Provincial Universities Key Laboratory of Industrial Control and Data Analysis, Fujian University of Technology, Fuzhou 350118, Fujian Province, P. R. China; School of Automation, Beijing Institute of Technology, Beijing 100081, P. R. China; Institute of Guarantee Technology, National Industrial Information Security Development Research Center, Beijing 100040, P. R. China)
Source
Journal of Shenzhen University (Science and Engineering) (《深圳大学学报(理工版)》), 2024, No. 5, pp. 583-593 (11 pages)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
Funding
National Natural Science Foundation of China (62376059); Open Fund of Fujian Provincial Universities Key Laboratory (KF-18-23004).
Keywords
artificial intelligence
broad learning system
self-attention mechanism
tracking differentiator
feature extraction
incremental learning