A Support Vector Classification Algorithm with Cluster-Based Reduction of Large-Scale Data (cited by: 10)

Cluster Method of Support Vector Machine to Solve Large-scale Data Set Classification
Abstract: Support vector classifiers train slowly on large-scale data sets. To address this bottleneck, a cluster-based data set reduction method, cluster Support Vector Machines (C-SVM), is proposed. First, a sample center-distance function is established and used to compute the proportional radius of each cluster set. The sample points are then scanned with a cluster mirror to determine the class of each cluster set. In a cluster set whose samples all share the same class, only the representative sample points are retained, and a deletion matrix is built for points of differing classes; together these steps reduce the sample set. The cluster reduction algorithm is proved to have low time complexity, and experiments illustrate the value of retaining representative points. Finally, experiments on random data and on UCI standard databases verify that the algorithm increases classification speed while maintaining classification accuracy. Classification accuracy can additionally be tuned by adjusting a threshold value.
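Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2009, No. 3, pp. 184-188 (5 pages)
Funding: National Natural Science Foundation of China (Nos. 10501009 and 10661005); Guilin University of Electronic Technology Soft Environment Project; Anhui University of Finance and Economics Youth Fund
Keywords: support vector machine (SVM); cluster set; large-scale data set; training speed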
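
The reduction idea summarized in the abstract can be illustrated with a short Python sketch. This is not the paper's algorithm: the record gives no formulas, so the greedy scan, the `ratio` parameter, the choice of the sample nearest the cluster centroid as the "representative", and the rule of keeping every sample in a mixed-class cluster are all assumptions made for illustration. The function name `reduce_by_clusters` is hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def reduce_by_clusters(samples, ratio=0.2):
    """Greedy cluster-based reduction of a two-class sample set (illustrative).

    samples: list of (point, label) pairs, where point is a tuple of floats.
    ratio:   hypothetical tuning knob; the cluster radius is
             ratio * distance between the two class centroids.
    """
    labels = sorted({label for _, label in samples})
    if len(labels) != 2:
        raise ValueError("this sketch assumes exactly two classes")
    centers = [centroid([p for p, lab in samples if lab == target])
               for target in labels]
    radius = ratio * math.dist(centers[0], centers[1])

    remaining = list(samples)
    kept = []
    while remaining:
        seed_point = remaining[0][0]
        # The seed is at distance 0 from itself, so every pass removes
        # at least one sample and the loop terminates.
        cluster = [s for s in remaining if math.dist(s[0], seed_point) <= radius]
        remaining = [s for s in remaining if math.dist(s[0], seed_point) > radius]
        if len({label for _, label in cluster}) == 1:
            # Single-class cluster: keep only the sample nearest its centroid.
            center = centroid([p for p, _ in cluster])
            kept.append(min(cluster, key=lambda s: math.dist(s[0], center)))
        else:
            # Mixed-class cluster: assumed to lie near the class boundary,
            # so every sample is kept.
            kept.extend(cluster)
    return kept
```

The reduced list can then be passed to any SVM trainer in place of the full sample set. A larger `ratio` removes more samples from the interior of each class, but it also makes mixed-class clusters (which are kept whole) more likely.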
  • Related literature

References (8)

Secondary references (68)

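  • 1 Hu Maozhi, Gu Hongying. Comparative analysis of different types of support vector machines and their performance [J]. Computer Engineering and Applications, 2005, 41(12): 37-40. Cited by: 8
  • 2 Bai Liang, Lao Songyang, Hu Yanli. A comparative study of support vector machine training algorithms [J]. Computer Engineering and Applications, 2005, 41(17): 79-81. Cited by: 15
  • 3 Lu Bo, Wei Xunkai, Bi Duyan. Applications of support vector machines in classification [J]. Journal of Image and Graphics, 2005, 10(8): 1029-1035. Cited by: 23
  • 4 Cristianini N, Shawe-Taylor J. An Introduction to Support Vector Machines [M]. Translated by Li Guozheng. Beijing: Publishing House of Electronics Industry, 2004.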
  • 5 Hearst M A, Dumais S T, Osuna E, Platt J, Schölkopf B. Support Vector Machines. IEEE Intelligent Systems, 1998, 13(4): 18-28.
  • 6 Ke Hai-Xin, Zhang Xue-Gong. Editing support vector machines. In: Proceedings of International Joint Conference on Neural Networks, Washington, USA, 2001, 2: 1464-1467.
  • 7 Vapnik V N. An overview of statistical learning theory. IEEE Transactions on Neural Networks, 1999, 10(5): 988-999.
  • 8 Vapnik V N. Statistical Learning Theory. 2nd ed. New York: Springer-Verlag, 1999.
  • 9 Klaus-Robert Müller, Sebastian Mika, Gunnar Raetsch, Koji Tsuda, and Bernhard Schoelkopf. An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks, 2001, 12(2): 181-201.
  • 10 Burges C J C. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 1998, 2(2): 121-167.

Co-citing literature (139)

Co-cited literature (74)

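  • 1 Li Honglian, Wang Chunhua, Yuan Baozong, Zhu Zhanhui. A learning strategy for support vector machines on large-scale training sets [J]. Chinese Journal of Computers, 2004, 27(5): 715-719. Cited by: 53
  • 2 Qian Xiaodong, Wang Zheng'ou. A text classification method based on improved KNN [J]. Information Science, 2005, 23(4): 550-554. Cited by: 19
  • 3 Wang Yu, Xu Jianmin. A text classification method based on RBF neural networks and decision trees [J]. Computer Engineering and Applications, 2005, 41(14): 175-178. Cited by: 4
  • 4 Wang Huazhong, Yu Jinshou. Kernel function methods and their model selection [J]. Journal of Jiangnan University (Natural Science Edition), 2006, 5(4): 500-504. Cited by: 40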
  • 5 T. W. Hsieh, J. S. Taur, S. Y. Kung. A KNN-Scoring Based Core-Growing Approach to Cluster Analysis [J]. Journal of Signal Processing Systems, 2009, (10): 1939-8018.
  • 6 KIM C J, HWANG K B. Naive Bayes classifier learning with feature selection for spam detection in social bookmarking [C]//Lecture Notes in Computer Science. Berlin: Springer-Verlag, 2008.
  • 7 LIU Xiao-zhang, FENG Guo-can. Kernel bisecting K-means clustering for SVM training sample reduction [C]//Proc of the 19th International Conference on Pattern Recognition. 2008: 1-4.
  • 8 XU Yan-zi, QIN Hua. A new optimization method of large-scale SVMs based on kernel distance clustering [C]//Proc of International Computational Intelligence and Software Engineering. 2009: 1-4.
  • 9 HOTHO A, JASCHKE R, SCHMITZ C, et al. Emergent semantics in BibSonomy [M]. Liskowsky: GI Jahrestagung, 2006: 305-312.
  • 10 MADKOUR A, HEFNI T, HEFNY A, et al. Using semantic features to detect spamming in social bookmarking systems [C]//Lecture Notes in Computer Science. Berlin: Springer-Verlag, 2008.

Citing literature (10)

Secondary citing literature (83)
