Abstract
To improve the precision of text classification, a sample-weighting algorithm based on k-nearest-neighbor density estimation is proposed that accounts for differences in sample density across the training set. The weight of a sample with higher-than-average density is strengthened, the weight of a sample at the average density is kept unchanged, and the weight of a sample with lower-than-average density is weakened. A neural network (NN) classifier built with this weighting is then applied to text classification. Experimental results show that the method improves the precision of text classification to some degree and outperforms the original unweighted classifier.
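The abstract does not give the weighting formula, so the following is only a minimal sketch of one way such a scheme could look. It assumes the standard k-nearest-neighbor density estimate (density proportional to k divided by the volume of the ball reaching the k-th nearest neighbor) and assumes the weight is the ratio of a sample's density to the mean density, which gives weights above 1 for dense samples, exactly 1 at the mean, and below 1 for sparse samples. The function name `knn_density_weights`, the Euclidean metric, and the ratio form are illustrative assumptions, not the paper's definitions.

```python
import numpy as np


def knn_density_weights(X, k=3):
    """Hypothetical helper: weight each sample by its k-NN density
    relative to the mean density of the training set (an assumed form)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n_samples")

    # Pairwise Euclidean distances between all training samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # After sorting each row, column 0 is the zero distance of a sample
    # to itself, so column k holds the distance to its k-th nearest neighbor.
    r_k = np.sort(dist, axis=1)[:, k]
    r_k = np.maximum(r_k, 1e-12)  # guard against division by zero

    # k-NN density estimate; the constant unit-ball volume is omitted
    # because it cancels when dividing by the mean below.
    density = k / (n * r_k ** d)

    # Assumed weighting: ratio to the mean density, so the weights average to 1.
    return density / density.mean()


if __name__ == "__main__":
    # Four tightly grouped points and one isolated point.
    X = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10]]
    print(knn_density_weights(X, k=2))
```

In this toy input the four grouped points receive weights above 1 and the isolated point a weight below 1. For high-dimensional text vectors the term `r_k ** d` can overflow or underflow, so a practical variant would likely work with log densities or a reduced feature space. How the weights enter the neural network's training is not stated in the abstract; scaling each sample's loss term by its weight is one possibility.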
Source
《计算机应用与软件》
CSCD
2009, No. 9, pp. 234-236, 239 (4 pages)
Computer Applications and Software
Keywords
k-nearest-neighbor density estimation
Neural network (NN)
Text classification