
Deep Subspace Clustering Algorithm with Data Augmentation and Adaptive Self-Paced Learning
Abstract Deep subspace clustering jointly performs self-expressive feature learning and cluster assignment, and thereby outperforms traditional clustering. Although many deep subspace clustering algorithms have been proposed for various applications, most of them fail to learn accurate clustering-oriented features. To address the imprecise feature representations learned by deep subspace clustering methods, which degrade the final clustering performance, an improved deep subspace clustering algorithm is proposed. The original samples are augmented by random shifts and rotations; the augmented samples are then used to train and optimize the autoencoder, alternating with updates of the samples' cluster assignments, so that more robust feature representations are learned. In the fine-tuning stage, the target of each augmented sample in the loss function is the cluster center to which the original sample is assigned. This target may be computed incorrectly, and samples with wrong targets mislead the training of the autoencoder network. Therefore, an adaptive self-paced learning algorithm that requires no additional hyperparameters is used to select the most confident samples in each iteration, which improves generalization. Experiments on the MNIST, USPS, and COIL100 datasets show that the algorithm achieves accuracies of 0.9318, 0.8934, and 0.7236, respectively. Ablation experiments and a sensitivity analysis further verify its effectiveness.
Authors JIANG Yuyan (江雨燕); TAO Chengfeng (陶承凤); LI Ping (李平) (School of Management Science and Engineering, Anhui University of Technology, Maanshan 243032, Anhui, China; School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
Source Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2023, No. 8, pp. 96-103, 110 (9 pages)
Funding National Natural Science Foundation of China (62006126).
Keywords deep learning; subspace clustering; data augmentation; adaptive self-paced learning; encoder
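
The two mechanisms named in the abstract, augmentation by random shift and rotation and self-paced selection of reliable samples, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the nearest-neighbour resampling, the squared-distance loss, and in particular the threshold rule (mean loss plus a progress-scaled standard deviation) are assumptions chosen for illustration; the abstract states only that the selection needs no additional hyperparameters.

```python
import numpy as np


def random_shift_rotate(img, rng, max_shift=2, max_deg=10.0):
    """Randomly shift and rotate a 2-D image (nearest neighbour, zero fill).

    max_shift (pixels) and max_deg (degrees) are illustrative defaults,
    not values reported in the paper.
    """
    h, w = img.shape
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    theta = np.deg2rad(rng.uniform(-max_deg, max_deg))
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, locate its source pixel.
    y0 = ys - cy - dy
    x0 = xs - cx - dx
    src_y = np.rint(np.cos(theta) * y0 + np.sin(theta) * x0 + cy).astype(int)
    src_x = np.rint(-np.sin(theta) * y0 + np.cos(theta) * x0 + cx).astype(int)
    valid = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    out = np.zeros_like(img)
    out[valid] = img[src_y[valid], src_x[valid]]
    return out


def per_sample_loss(z_aug, centers, labels):
    """Squared distance from each augmented embedding to its target center.

    labels[i] is the cluster assigned to the ORIGINAL sample i, so the
    target may be wrong when that assignment is wrong.
    """
    return np.sum((z_aug - centers[labels]) ** 2, axis=1)


def self_paced_mask(losses, step, total_steps):
    """Select samples whose loss is at or below an adaptive threshold.

    Assumed rule: threshold = mean + (step / total_steps) * std, so the
    threshold is derived from the loss statistics and loosens over time.
    """
    lam = losses.mean() + (step / total_steps) * losses.std()
    return losses <= lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((28, 28))            # stand-in for an MNIST-sized digit
    augmented = random_shift_rotate(image, rng)
    print(augmented.shape)

    centers = rng.normal(size=(3, 8))       # hypothetical cluster centers
    labels = rng.integers(0, 3, size=16)    # hypothetical assignments
    z_aug = centers[labels] + 0.1 * rng.normal(size=(16, 8))
    losses = per_sample_loss(z_aug, centers, labels)
    mask = self_paced_mask(losses, step=1, total_steps=10)
    print(int(mask.sum()), "of", mask.size, "samples selected")
```

In a training loop built on this sketch, only the samples where the mask is true would contribute to the fine-tuning loss at that step, which limits the influence of samples whose target center is likely wrong.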