Abstract
To address the low efficiency of SVM classification on large-scale training sets, where high sample dimensionality leads to heavy memory consumption, a computing strategy based on a grid environment is proposed. For compute-intensive problems, the strategy offers three schemes that decompose the task by step, by function, or by data; users choose among them according to the actual requirements of SVM training and classification. Comparative experiments on remote sensing data in a single-machine environment and in a grid environment show that the strategy speeds up both training and classification, and that it meets, at the level of the computing environment, the high computing-performance demands of processing large-scale data.
Source
Modern Computer (《现代计算机》)
2010, No. 7, pp. 16-19, 23 (5 pages in total)
Funding
Natural Science Foundation of Guangdong Province (No. 8452800001001086)
Keywords
SVM
Grid Computing
Classification
Large Scale Data Sets