Abstract
Existing one-class classification algorithms typically use the classical Euclidean metric to describe similarity between samples. However, the Euclidean metric does not adequately reflect the intrinsic distribution structure of some datasets, which weakens the descriptive ability of these methods. This paper proposes a distance metric learning algorithm for one-class data in high-dimensional space, aimed at improving the descriptive performance of one-class classifiers. Unlike existing distance metric learning algorithms, the proposed algorithm requires only target-class data. By introducing a regularization term based on the prior sample distribution and an L1-norm penalty that enforces sparsity of the distance metric, it can effectively solve the one-class distance metric learning problem in high-dimensional space with small sample sizes. The resulting optimization problem is solved efficiently with a block coordinate descent algorithm. The learned metric can be easily embedded into one-class classifiers. Simulation results show that the learned metric effectively improves the descriptive performance of one-class classifiers, in particular that of SVDD, and thereby gives one-class classifiers stronger generalization ability.
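The abstract does not state the learning formulation, so the following is only an illustrative sketch of what embedding a learned metric into a one-class classifier can look like. It assumes the learned metric is a symmetric positive semidefinite matrix M used in a Mahalanobis-type distance; the function name `metric_distance` and the example matrices are hypothetical and are not taken from the paper.

```python
import math

def metric_distance(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a symmetric PSD matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    # Quadratic form diff^T M diff, written out with plain loops.
    quad = sum(diff[i] * M[i][j] * diff[j] for i in range(n) for j in range(n))
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(quad, 0.0))

# With the identity matrix the distance reduces to the Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical sparse (diagonal) metric that down-weights the second feature.
sparse_metric = [[1.0, 0.0], [0.0, 0.25]]

center = [0.0, 0.0]   # e.g. the center of a data description
point = [3.0, 4.0]

euclidean = metric_distance(point, center, identity)
learned = metric_distance(point, center, sparse_metric)
```

In a distance-based description such as SVDD, a sample is typically scored by its distance to a center, so swapping the identity for a learned M changes which samples fall inside the description boundary without changing the classifier itself.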
Source
《数学的实践与认识》
CSCD
Peking University Core Journal (北大核心)
2011, Issue 6, pp. 116-124 (9 pages)
Mathematics in Practice and Theory
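Funding
Natural Science Foundation of Hebei Province (F2008000891)
Natural Science Foundation of Hebei Province (F2010001297)
China Postdoctoral Science Foundation (20080440124)
China Postdoctoral Science Foundation Special Grant, Second Batch (200902356)
National Natural Science Foundation of China (61071199)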
Keywords
high-dimensional space
sparse distance metric learning
L1-norm
one-class classifier