A novel domain adaptation approach based on data classification (基于数据分类的领域自适应新算法)

Cited by: 1
Abstract: Conventional machine learning assumes that training data and test data follow the same distribution, whereas domain adaptation algorithms transfer knowledge and learn when the distributions differ; they are widely applied in data mining, data correction, and data prediction. The support vector machine (SVM) addresses binary classification by seeking an optimal separating hyperplane in a high-dimensional space so as to minimize the classification error rate. The center-constrained minimum enclosing ball (CCMEB), proposed by I. Tsang, is a minimum enclosing ball algorithm that improves on the core vector machine (CVM) and runs quickly on large datasets; it also applies to binary-classification SVM datasets. Combining SVM, CCMEB, and probability distribution theory, the paper proposes a new domain adaptation algorithm based on data classification, CCMEB-SVMDA. By computing the center of the enclosing ball for each class's data group, the algorithm can correct data from different domains as a whole and identify their similarity, and it is convenient and adaptive. The algorithm was validated on UCI data and text classification data, with good results.
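Authors: Gu Xin, Wang Shitong
Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, PKU Core Journal, 2014, No. 2, pp. 275-285 (11 pages)
Funding: National Natural Science Foundation of China (61170122, 60975027); Jiangsu Province Graduate Innovation Project (CXZZ11-0483)
Keywords: support vector machine (SVM); domain adaptation; minimum enclosing ball; center-constrained minimum enclosing ball (CCMEB)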
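
The abstract's central step, computing an enclosing-ball center for each class's data and comparing centers across domains, can be illustrated with a small sketch. This is not the paper's CCMEB-SVMDA algorithm, whose details this record does not give. It is a simplified stand-in that approximates an ordinary minimum enclosing ball center in input space (no kernel, no center constraint) with the Bădoiu-Clarkson iteration. The function names `meb_center`, `class_centers`, and `center_shift`, and the toy data, are assumptions made for illustration.

```python
import numpy as np


def meb_center(points, n_iter=1000):
    """Approximate the center of the minimum enclosing ball of `points`.

    Uses the Badoiu-Clarkson iteration: repeatedly step toward the
    farthest point with a shrinking step size 1 / (i + 1).
    """
    points = np.asarray(points, dtype=float)
    center = points[0].copy()
    for i in range(1, n_iter + 1):
        dists = np.linalg.norm(points - center, axis=1)
        farthest = points[np.argmax(dists)]
        center += (farthest - center) / (i + 1)
    return center


def class_centers(X, y):
    """Return {label: approximate enclosing-ball center} for each class."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    return {label: meb_center(X[y == label]) for label in np.unique(y)}


def center_shift(source_centers, target_centers):
    """Distance between same-class centers of two domains.

    Hypothetical similarity cue for this sketch, not the paper's measure.
    """
    return {
        label: float(np.linalg.norm(source_centers[label] - target_centers[label]))
        for label in source_centers
    }


# Toy data (invented): two classes, each the four corners of a square.
square = np.array([[0, 0], [2, 0], [0, 2], [2, 2]], dtype=float)
X_src = np.vstack([square, square + 10.0])
y_src = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Target domain: the same data translated by (3, 4), same labels.
X_tgt = X_src + np.array([3.0, 4.0])

shifts = center_shift(class_centers(X_src, y_src), class_centers(X_tgt, y_src))
print(shifts)
```

On this toy data each per-class distance comes out close to 5, the length of the (3, 4) translation. A small distance would suggest the two domains are similar for that class, and the difference between the centers could serve as a whole-dataset correction vector. How the paper actually uses the centers is not stated in this record.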

References (18)

  • 1. Daumé III H, Marcu D. Domain adaptation for statistical classifiers[J]. Journal of Artificial Intelligence Research, 2006, 26(1):101-126.
  • 2. Blitzer J, McDonald R, Pereira F. Domain adaptation with structural correspondence learning[C]//Proc of the 2006 Conference on Empirical Methods in Natural Language Processing, 2006:120-128.
  • 3. Daumé III H. Frustratingly easy domain adaptation[C]//Proc of the 45th Annual Meeting of the Association of Computational Linguistics, 2007:1.
  • 4. Jiang Jing, Zhai Cheng-xiang. A two-stage approach to domain adaptation for statistical classifiers[C]//Proc of CIKM'07, 2007:401-410.
  • 5. Blitzer J, Dredze M, Pereira F, et al. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification[C]//Proc of ACL'07, 2007:440-447.
  • 6. Satpal S, Sarawagi S. Domain adaptation of conditional probability models via feature subsetting[C]//Proc of PKDD'07, 2007:224-235.
  • 7. Jiang Jing, Zhai Cheng-xiang. Instance weighting for domain adaptation in NLP[C]//Proc of ACL'07, 2007:264-271.
  • 8. Tsang I W, Kwok J T, Cheung P, et al. Core vector machines: Fast SVM training on very large data sets[J]. Journal of Machine Learning Research, 2005, 6(4):363-392.
  • 9. Qian Pengjiang, Wang Shitong, Deng Zhaohong. Fast mean shift spectral clustering algorithm for large data sets[J]. Control and Decision, 2010, 25(9):1307-1312 (in Chinese). Cited by: 5
  • 10. Tsang I, Kwok J, Zurada J. Generalized core vector machines[J]. IEEE Transactions on Neural Networks, 2006, 17(5):1126-1139.

