
Transfer Affinity Propagation Clustering Algorithm (迁移近邻传播聚类算法) Cited by: 17
Abstract: Traditional clustering algorithms often perform poorly when little usable data is available in the target domain. In this setting, extracting useful knowledge from a source domain to guide learning in the target domain is an effective strategy for obtaining a more appropriate number of clusters and better clustering performance. This paper proposes a transfer clustering algorithm based on affinity propagation, called transfer affinity propagation (TAP). When the data distributions of the source and target domains are similar, TAP introduces a transfer learning mechanism to improve the performance of affinity propagation (AP) clustering on insufficient data. To keep the transfer effective, TAP modifies the update rules of the two messages passed in AP by jointly considering the statistical properties and the geometric structure of the source and target domains, which gives the algorithm its transfer capability and lets the source domain assist learning in the target domain. The corresponding factor graph further shows that TAP can transfer knowledge efficiently through an AP-like message-passing mechanism even when target-domain data are scarce, which supports the final clustering result. Experiments on synthetic and real-world datasets show that the proposed algorithm outperforms the classical AP algorithm on clustering tasks with insufficient data.
Authors: HANG Wen-Long (杭文龙), JIANG Yi-Zhang (蒋亦樟), LIU Jie-Fang (刘解放), WANG Shi-Tong (王士同) (School of Digital Media, Jiangnan University, Wuxi 214122, China)
Source: Journal of Software (《软件学报》; indexed in EI, CSCD, and the Peking University Core Journals list), 2016, No. 11, pp. 2796-2813 (18 pages)
Funding: National Natural Science Foundation of China (61272210, 61202311, 61300151); Natural Science Foundation of Jiangsu Province (BK2012552, BK20130155)
Keywords: transfer learning; statistical property; geometric structure; affinity propagation (AP); clustering method; insufficient data
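
The abstract describes TAP as a modification of the two message updates used in AP. This record does not give TAP's modified rules, so the following is only a minimal sketch of the classic AP updates (responsibility and availability) as published by Frey and Dueck (2007), written in Python with NumPy. The function name `affinity_propagation`, the damping value, the median preference, and the toy data are illustrative assumptions, not anything taken from the paper.

```python
import numpy as np

def affinity_propagation(S, damping=0.7, max_iter=200):
    """Classic AP (not TAP) on an n x n similarity matrix S.

    S[k, k] holds the preference of point k to become an exemplar.
    Returns (exemplar indices, cluster label per point).
    """
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    rows = np.arange(n)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        M = A + S
        best = np.argmax(M, axis=1)
        first = M[rows, best]
        M[rows, best] = -np.inf
        second = M.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new

        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = diag
        A = damping * A + (1 - damping) * A_new

    # A point k is an exemplar when r(k,k) + a(k,k) > 0.
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels

# Toy usage (assumed data): two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # negative squared distance
off_diag = S[~np.eye(len(X), dtype=bool)]
np.fill_diagonal(S, np.median(off_diag))  # common default: median preference
exemplars, labels = affinity_propagation(S)
print(exemplars, labels)
```

A transfer variant in the spirit of the abstract would change how `R_new` and `A_new` are computed so that source-domain statistics and geometry enter the updates; that part is deliberately left out here because the record does not specify it.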


