Abstract
Linear discriminant analysis (LDA) is a statistical learning method. Many improved variants have been proposed to address its singularity on small-sample problems and its sensitivity to contaminated samples. This paper proposes discriminant analysis methods based on uncertainty sets derived from the Kullback-Leibler (KL) divergence. The proposed methods use the Ls norm to define the between-class distance and the Lr norm to define the within-class distance, and they model within-class samples and class means probabilistically through KL-divergence uncertainty sets. First, a regularized adversarial discriminant analysis model is proposed that places more emphasis on samples that are difficult to separate, and the generalized Dinkelbach algorithm is used to solve it. One advantage of this algorithm is that, under suitable conditions, the optimization subproblems need not be solved exactly. A projected (sub)gradient method is used to solve the subproblems. In addition, a regularized optimistic discriminant analysis model is proposed, and alternating optimization is used to solve the subproblems of the generalized Dinkelbach algorithm. Experiments on many data sets show that the proposed models outperform several existing models. On contaminated data sets in particular, regularized optimistic discriminant analysis performs well because it places more emphasis on samples near the class means.
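The abstract gives no formulas, so the following is only a minimal sketch of two ingredients it names, under simplifying assumptions that are not taken from the paper. First, it uses the classical Dinkelbach iteration (not the generalized variant) on the special case s = r = 2, where the ratio of between-class to within-class scatter is a Rayleigh quotient and each subproblem is solved exactly by an eigendecomposition. Second, it assumes a penalized form of the KL uncertainty set around the uniform distribution, for which the optimal sample weights are an exponential tilt of the per-sample losses. The function names `kl_tilted_weights` and `dinkelbach_rayleigh` and the parameter `eta` are hypothetical.

```python
import numpy as np

def kl_tilted_weights(losses, eta, adversarial=True):
    """Weights p solving max_p sum(p*loss) - KL(p||uniform)/eta (adversarial)
    or min_p sum(p*loss) + KL(p||uniform)/eta (optimistic).
    Assumed penalized form; the closed-form solution is p_i ~ exp(+/- eta*loss_i)."""
    sign = 1.0 if adversarial else -1.0
    z = sign * eta * np.asarray(losses, dtype=float)
    z -= z.max()                      # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def dinkelbach_rayleigh(Sb, Sw, tol=1e-10, max_iter=100):
    """Classical Dinkelbach iteration for max_w (w'Sb w)/(w'Sw w) with ||w||_2 = 1.
    Sb is symmetric positive semidefinite, Sw symmetric positive definite."""
    n = Sb.shape[0]
    w = np.ones(n) / np.sqrt(n)
    lam = (w @ Sb @ w) / (w @ Sw @ w)
    for _ in range(max_iter):
        # Subproblem: maximize w'(Sb - lam*Sw)w on the unit sphere,
        # which is attained by the eigenvector of the largest eigenvalue.
        vals, vecs = np.linalg.eigh(Sb - lam * Sw)
        w = vecs[:, -1]
        if vals[-1] < tol:            # subproblem optimum ~ 0 means lam is optimal
            break
        lam = (w @ Sb @ w) / (w @ Sw @ w)
    return w, lam

# Small synthetic example (random scatter matrices, for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
B = rng.standard_normal((5, 2))
Sw = A @ A.T + np.eye(5)              # positive definite within-class scatter
Sb = B @ B.T                          # low-rank between-class scatter
w_opt, lam_opt = dinkelbach_rayleigh(Sb, Sw)
```

In this sketch, adversarial weights grow with the loss and so emphasize hard-to-separate samples, whereas optimistic weights shrink with the loss and so emphasize samples with small within-class distance, which matches the qualitative behaviour the abstract describes. For general Ls and Lr norms the subproblem has no eigenvector solution, which is where the abstract's projected (sub)gradient and alternating optimization steps would come in.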
Authors
LIANG Zhi-Zhen (梁志贞)
ZHANG Lei (张磊)
School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116; Digitization of Mine, Engineering Research Center of Ministry of Education, Xuzhou 221116
Source
Acta Automatica Sinica (《自动化学报》)
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2022, No. 4, pp. 1033-1047 (15 pages)
Funding
Supported by the National Natural Science Foundation of China (61976216).
Keywords
Discriminant analysis
KL divergence
Uncertainty sets
Regularization
Data classification