Abstract
A symmetric nonnegative matrix factorization algorithm based on self-paced learning is proposed. It uses an error-driven approach to help the model better distinguish normal samples from abnormal samples, thereby improving clustering performance. The method assigns every sample a weight variable that measures how easy or difficult that sample is, and constrains this variable with a hard-weighting strategy and a soft-weighting strategy, respectively, to keep the model reasonable. Cluster analysis on multiple datasets, including image and text data, shows that the proposed algorithm is effective.
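The abstract does not state the objective function or the update rules, so the following is only a minimal sketch of how per-sample weights with hard and soft weighting might be combined with symmetric NMF. Everything here is an assumption for illustration: the function names (`hard_weights`, `soft_weights`, `self_paced_symnmf`), the row-wise squared residual used as the per-sample error, the pairwise weight `v_i * v_j`, the damped multiplicative update, and the schedule for the threshold `lam`. The two weighting functions follow the binary and linear schemes commonly used in self-paced learning, which may differ from the ones in the paper.

```python
import numpy as np

def hard_weights(losses, lam):
    """Hard weighting: weight 1 if the sample's loss is below lam, else 0."""
    return (losses < lam).astype(float)

def soft_weights(losses, lam):
    """Linear soft weighting: weight falls from 1 to 0 as the loss rises to lam."""
    return np.clip(1.0 - losses / lam, 0.0, 1.0)

def self_paced_symnmf(A, k, weight_fn=soft_weights, n_outer=10, n_inner=50,
                      growth=1.3, beta=0.5, eps=1e-10, seed=0):
    """Illustrative self-paced SymNMF: A (n x n, nonnegative) ~ H @ H.T, H >= 0.

    Alternates between (a) fitting H on the currently weighted samples and
    (b) recomputing sample weights from per-sample errors. This is an assumed
    scheme, not the update rule from the paper.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k))
    v = np.ones(n)          # start with every sample fully included
    lam = None
    for _ in range(n_outer):
        W = np.outer(v, v)  # assumed pairwise weight for entry (i, j)
        for _ in range(n_inner):
            num = (W * A) @ H
            den = (W * (H @ H.T)) @ H
            # Damped multiplicative step; eps on both sides leaves rows with
            # zero weight unchanged instead of shrinking them.
            H *= 1.0 - beta + beta * (num + eps) / (den + eps)
        # Per-sample error: squared residual of each row (an assumption).
        losses = np.sum((A - H @ H.T) ** 2, axis=1)
        # Threshold starts at the median loss, then grows to admit harder samples.
        lam = np.median(losses) + eps if lam is None else lam * growth
        v = weight_fn(losses, lam)
    return H, v

# Usage: a toy similarity matrix with two blocks of four samples each.
A_demo = np.kron(np.eye(2), np.ones((4, 4)))
H_demo, v_demo = self_paced_symnmf(A_demo, k=2, weight_fn=hard_weights)
labels_demo = H_demo.argmax(axis=1)  # cluster label = index of the largest factor
```

Samples whose error stays above the threshold keep a weight of zero (hard) or a small weight (soft), so they contribute little to the factorization; this is one way an error-driven scheme could separate normal samples from abnormal ones.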
Authors
WANG Lei (王雷); DU Liang (杜亮); ZHOU Peng (周芃); WU Peng (吴鹏)
Affiliations: College of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China; College of Computer Science and Technology, Anhui University, Hefei 230601, China
Source
Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报(理学版)》)
Indexed in: CAS; PKU Core (Peking University core journal list)
2022, Issue 5, pp. 43-48 (6 pages)
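Funding
National Natural Science Foundation of China (61502289, 61806003)
Key Research and Development Program of Shanxi Province (201803D31199)
Natural Science Foundation of Shanxi Province (201801D221163, 201801D221173)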
Keywords
unsupervised learning
symmetric nonnegative matrix factorization
error-driven
self-paced learning
clustering