A multi-criteria active learning method based on adaptive density clustering
Abstract Active learning can train better machine learning models at lower labeling cost. Combining the RD and QBC algorithms effectively addresses the limitation of considering only a single criterion. However, the K-means clustering on which RD is based also includes outliers, which degrades model performance, and QBC must maintain multiple models and only indirectly yields the informativeness of a sample. To address these issues, this paper proposes an adaptive density clustering-based Gaussian process regression (ADC-GPR) algorithm, which selects samples efficiently by clustering first and then using uncertainty directly. The ADC clustering is robust to outliers, adapts to the distribution characteristics of the dataset, and supplies representative sample points and their corresponding clusters for the subsequent active learning (AL). The method ensures representativeness and diversity during unsupervised selection and considers informativeness, representativeness, and diversity during supervised selection. Experimental results show that, with the same number of sampling iterations, ADC-GPR improves average performance by 37.3%, 8%, and 2.8% over the RS, KS, and RD-GPR algorithms, respectively, and achieves higher selection efficiency.
Authors He Zhonghai; Zhu Wenhan; Chen Xuwang; Zhang Xiaofang (School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China; Hebei Key Laboratory of Micro-Nano Sensing, Qinhuangdao 066004, China; School of Optoelectronics, Beijing Institute of Technology, Beijing 100081, China)
Source Chinese Journal of Scientific Instrument (《仪器仪表学报》), 2024, No. 3, pp. 179-187 (9 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding Supported by the Natural Science Foundation of Hebei Province (F2020501040).
Keywords active learning; adaptive density clustering; Gaussian process regression; outlier robustness; multi-criteria fusion
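
The abstract's idea of screening by density before querying by uncertainty can be illustrated with a minimal sketch. This is not the paper's ADC-GPR algorithm, whose details the abstract does not give. As stand-ins, the sketch assumes a plain neighbor-count density filter in place of ADC clustering, a fixed squared-exponential kernel, and one-dimensional inputs. All function names and parameters (`dense_points`, `gp_variance`, `select_next`, `radius`, `min_neighbors`) are hypothetical. It covers only outlier robustness and informativeness, not the representativeness and diversity criteria.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def gp_variance(x_star, labeled, noise=1e-6):
    """GP posterior variance at x_star; with fixed hyperparameters it
    depends only on the labeled input locations, not on their labels."""
    if not labeled:
        return rbf(x_star, x_star)
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(labeled)] for i, a in enumerate(labeled)]
    k = [rbf(a, x_star) for a in labeled]
    alpha = solve(K, k)
    return rbf(x_star, x_star) - sum(ki * ai for ki, ai in zip(k, alpha))

def dense_points(pool, radius=0.5, min_neighbors=2):
    """Simplified density filter (a stand-in, not ADC): keep points with
    at least min_neighbors other pool points within radius."""
    kept = []
    for i, x in enumerate(pool):
        count = sum(1 for j, y in enumerate(pool)
                    if j != i and abs(x - y) <= radius)
        if count >= min_neighbors:
            kept.append(x)
    return kept

def select_next(pool, labeled):
    """Query the dense, unlabeled point with the largest GP variance."""
    candidates = [x for x in dense_points(pool) if x not in labeled]
    return max(candidates, key=lambda x: gp_variance(x, labeled))

# Toy pool: two dense groups plus one isolated point at 10.0.
pool = [0.0, 0.1, 0.2, 0.3, 2.0, 2.1, 2.2, 2.3, 10.0]
labeled = [0.1]
print(select_next(pool, labeled))
```

In this toy pool, ranking by uncertainty alone would choose the isolated point at 10.0, because it lies farthest from the labeled sample. The density filter removes it first, so the query falls inside the second dense group instead.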