Abstract
Classification is one of the most practically valuable techniques in data mining and data analysis. Traditional eager-learning methods must assume a model space in advance and do not fully account for correlations between instances, which can limit their generalization ability. To address these problems, we propose MLWR, a locally weighted regression method based on a new mapping relationship. For a given test sample, MLWR first finds its nearest neighbors in the training set, then builds a regression function between the test sample and those neighbors, and finally computes the test sample's label from the regression model and the neighbors' labels. In the experiments, five classification methods are tested on nine UCI data sets. The results show that MLWR achieves higher classification accuracy than the compared methods and also applies well to larger data sets.
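The neighbor-then-regress pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' MLWR: the abstract does not specify the mapping relationship, so the sketch substitutes the simplest standard form of locally weighted regression, a Gaussian-kernel, locally constant fit over the k nearest neighbors. The function names `knn` and `lwr_classify`, the Euclidean distance, and the bandwidth heuristic are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def knn(train_X, x, k):
    """Return (distance, index) pairs for the k training points closest to x."""
    dists = [(math.dist(row, x), i) for i, row in enumerate(train_X)]
    dists.sort()
    return dists[:k]


def lwr_classify(train_X, train_y, x, k=5):
    """Label x by kernel-weighted, locally constant regression on its k neighbors.

    Illustrative stand-in only; the paper's MLWR mapping is not reproduced here.
    """
    neighbors = knn(train_X, x, k)
    # Bandwidth: distance to the farthest neighbor (assumed heuristic).
    h = max(d for d, _ in neighbors) or 1.0
    scores = defaultdict(float)
    for d, i in neighbors:
        w = math.exp(-((d / h) ** 2))  # Gaussian kernel weight
        scores[train_y[i]] += w
    total = sum(scores.values())
    # Normalized scores are the local regression estimate of each class indicator.
    probs = {label: s / total for label, s in scores.items()}
    return max(probs, key=probs.get), probs


# Toy data (hypothetical): two well-separated classes in the plane.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["a", "a", "a", "b", "b", "b"]
label, probs = lwr_classify(train_X, train_y, (0.15, 0.15), k=3)
```

Because no model is fitted until a test sample arrives, the sketch is a lazy learner in the sense of the keywords; closer neighbors contribute more to the predicted label than distant ones.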
Source
Computer Engineering & Science (《计算机工程与科学》)
CSCD
Peking University Core Journal
2015, No. 10, pp. 1959-1964 (6 pages)
Funding
Zhejiang Provincial Department of Education project (Y201328291)
Zhejiang Provincial Natural Science Foundation projects (LZ14F030001, LY14F020012)
Keywords
classification
mapping relationship
locally weighted regression
k-NN
lazy learning