Protection-type transfer learning clustering algorithm with cosine distance metric (cited by: 1)
Abstract: Previous researchers have mostly studied transfer learning and traditional machine learning from the standpoint of whether the formulas are well founded, while neglecting the problem as a whole. As a result, their algorithms cannot classify thoroughly when applied to concrete text classification problems. By examining the entire text classification process, this paper uses cosine distance in the k-means algorithm, which markedly improves the experimental results. It also proposes a protection-type iteration scheme, abandons the traditional word feature space in favor of a hidden (latent) space as the feature vector space, and imposes normalization constraints. Taking the CCI algorithm as an example, these ideas are combined into an improved algorithm named PCCI, which markedly raises the classification accuracy of transfer learning while reducing computational complexity. Tests on the 20-NewsGroups and Reuters-21578 datasets, with comparisons against other existing transfer learning algorithms, demonstrate the superiority of the improved algorithm.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2015, No. 23, pp. 131-138, 225 (9 pages)
Keywords: transfer learning; Euclidean distance; cosine distance; protection-type; normalization constraints; over dimension
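
The abstract's central technical step, running k-means with cosine distance on normalized feature vectors, can be illustrated with a minimal sketch. This is not the PCCI algorithm and does not model its protection-type iteration or its hidden-space features, whose details the abstract does not give. It assumes dense list-of-float vectors and caller-supplied initial centroids, and the function names `normalize`, `cosine_distance`, and `cosine_kmeans` are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(a, b)) / (na * nb)


def cosine_kmeans(points, init_centroids, max_iter=100):
    """k-means that assigns by cosine distance and renormalizes centroids.

    Illustrative assumption: every point and centroid is kept at unit
    length, so only direction, not magnitude, affects the clustering.
    """
    points = [normalize(p) for p in points]
    centroids = [normalize(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid under cosine distance.
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: cosine_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: mean of the members, projected back to unit length.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[j] = normalize(mean)
    return labels, centroids


# Two directions in 2-D; vector lengths differ but are ignored.
docs = [[1.0, 0.0], [10.0, 1.0], [0.0, 2.0], [1.0, 9.0]]
labels, centroids = cosine_kmeans(docs, [[1.0, 0.0], [0.0, 1.0]])
print(labels)  # → [0, 0, 1, 1]
```

Because the vectors are normalized first, a long document vector such as `[10.0, 1.0]` is grouped with `[1.0, 0.0]` by direction alone, which is the usual motivation for preferring cosine distance to Euclidean distance on text features.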

