Abstract
Ordinal regression is a machine learning paradigm that classifies patterns by exploiting the natural order among class labels. Many ordinal regression methods have been proposed, but their performance is often constrained by limited training samples. Inspired by the recently proposed marginalized corrupted features, this paper compensates for limited samples by corrupting the training inputs with random noise drawn from known distributions and the training outputs (labels) with controllable deterministic biases. On this basis it extends least squares ordinal regression to least squares ordinal regression using doubly corrupted features (LSOR-DCF). Experimental results show that LSOR-DCF outperforms both the uncorrupted baseline and the variants that corrupt only the inputs or only the outputs, with the advantage most pronounced on small datasets.
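The input-corruption idea can be illustrated with a minimal sketch. It is not the LSOR-DCF formulation itself, which the abstract does not spell out. The sketch assumes additive zero-mean Gaussian input noise with variance sigma^2, for which the expected squared loss has a closed-form minimizer equivalent to ridge regression with penalty n * sigma^2. The function names, the `label_bias` argument, and the round-to-nearest-rank decision rule are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def fit_marginalized_ls(X, y, sigma=0.1, label_bias=None):
    """Least squares fit with Gaussian input noise marginalized out.

    Assumption: x_tilde = x + eps, eps ~ N(0, sigma^2 I). Then
    sum_i E[(w.x_tilde_i + b - y_i)^2] = ||Xw + b - y||^2 + n*sigma^2*||w||^2,
    so no noisy copies of the data need to be sampled.
    label_bias (hypothetical) is a deterministic per-sample shift added to y.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if label_bias is not None:
        y = y + np.asarray(label_bias, dtype=float)
    # Centering handles the unpenalized intercept b.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    n, d = Xc.shape
    A = Xc.T @ Xc + n * sigma**2 * np.eye(d)
    w = np.linalg.solve(A, Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


def predict_rank(X, w, b, n_ranks):
    """Illustrative decision rule: round the score to the nearest rank in 1..n_ranks."""
    scores = np.asarray(X, dtype=float) @ w + b
    return np.clip(np.rint(scores), 1, n_ranks).astype(int)
```

With `sigma=0` the fit reduces to ordinary least squares. Larger `sigma` shrinks the weights, which is how marginalizing over the corruption acts as a regularizer when training samples are scarce.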
Source
《计算机科学与探索》
CSCD
2014, No. 9, pp. 1085-1091 (7 pages)
Journal of Frontiers of Computer Science and Technology
Funding
Specialized Research Fund for the Doctoral Program of Higher Education
Natural Science Foundation of Jiangsu Province
Keywords
ordinal regression
least squares regression
marginalized corrupted features
double corruption