A Semi-supervised Classification Algorithm Based on the Idea of Cross Validation

Cited by: 9
Abstract: To improve the effectiveness of semi-supervised classification, a semi-supervised classification method based on the idea of cross validation (CV-S3VM) is proposed. Each unlabeled sample is given a pseudo-label and added to the labeled sample set, which then takes part in cross validation; the pseudo-label that minimizes the error of the SVM classifier is selected as the sample's final label. Unlabeled samples are processed one after another in this way, so that the information implicit in them is mined and the number of labeled samples grows. Experiments that simulate a semi-supervised classification setting on UCI datasets show that CV-S3VM achieves a higher classification rate, and that the effect is more pronounced when few labeled samples are available.
Author: ZHAO Jianhua (赵建华)
Source: Journal of Southwest University of Science and Technology (《西南科技大学学报》), CAS, 2014, No. 1, pp. 34-38, 48 (6 pages)
Funding: Scientific Research Program of the Education Department of Shaanxi Province (12JK0748)
Keywords: machine learning; semi-supervised classification; cross validation; support vector machine
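
The labeling loop described in the abstract can be sketched roughly as follows. This is an illustrative reading of the abstract only, not the paper's implementation: it assumes binary labels in {-1, +1}, a fixed fold assignment, and a small linear SVM trained by subgradient descent that stands in for a full SVM library so that the sketch has no dependencies. All function names, parameters, and the toy data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Stand-in SVM (assumption): batch subgradient descent on the
    L2-regularised hinge loss, with labels in {-1, +1}."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(model, x):
    w, b = model
    return 1 if dot(w, x) + b >= 0 else -1


def cv_error(X, y, k=4):
    """k-fold cross-validation error rate; sample i is placed in fold i % k."""
    errors = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        model = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        errors += sum(predict(model, X[i]) != y[i] for i in test)
    return errors / len(X)


def cv_pseudo_label(X_lab, y_lab, X_unlab, labels=(-1, 1), k=4):
    """For each unlabeled sample in turn, try every candidate label, keep the
    one with the lowest cross-validation error, and add the sample to the
    labeled set before moving on to the next one."""
    X, y = list(X_lab), list(y_lab)
    assigned = []
    for x in X_unlab:
        best = min(labels, key=lambda c: cv_error(X + [x], y + [c], k))
        X.append(x)
        y.append(best)
        assigned.append(best)
    return assigned, X, y


# Hypothetical toy data: two well-separated clusters in the plane.
X_lab = [(-2, -2), (-3, -2), (-2, -3), (-3, -3),
         (2, 2), (3, 2), (2, 3), (3, 3)]
y_lab = [-1, -1, -1, -1, 1, 1, 1, 1]
X_unlab = [(-2.5, -2.5), (2.5, 2.5), (-2, -2.5), (2.5, 2)]

assigned, X_all, y_all = cv_pseudo_label(X_lab, y_lab, X_unlab)
print(assigned)
```

On this toy data a wrongly pseudo-labeled point is misclassified in the fold that holds it out, so the candidate matching the point's cluster should have the lower cross-validation error. Trying every candidate label costs one full cross-validation run per label per sample, which is why the sketch keeps the data and the classifier deliberately small.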