Abstract
Metric learning improves the accuracy of classification and clustering by modeling the distances between samples more faithfully. Geometric Mean Metric Learning (GMML) learns a metric matrix A under which distances between same-class points are as small as possible and distances between different-class points are as large as possible. GMML trains only on target-class data and ignores the large amount of non-target-class data available from the same domain, known as Universum data, which wastes the domain information that such data carry. To address this, this paper proposes a new metric learning algorithm, GMML with Universum learning (U-GMML). U-GMML seeks a metric matrix A under which same-class distances are as small as possible, different-class distances are as large as possible, and distances between Universum data and target-class data are as large as possible, so that the learned metric is better suited to classification. Experimental results on real datasets verify the effectiveness of the algorithm.
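The idea summarized in the abstract can be illustrated with a minimal sketch. It assumes the standard GMML closed form, in which the metric is the matrix geometric mean of the inverse similar-pair scatter S^{-1} and the dissimilar-pair scatter D. The abstract does not state how U-GMML incorporates Universum data, so the Universum step below (adding a weighted Universum-to-target scatter U to D) is only one plausible reading and not necessarily the paper's formulation. The function names and the parameters `lam` and `reg` are hypothetical.

```python
import numpy as np


def _spd_power(M, p):
    """Raise a symmetric positive-definite matrix to the power p."""
    M = (M + M.T) / 2.0                 # remove floating-point asymmetry
    w, V = np.linalg.eigh(M)
    w = np.clip(w, 1e-12, None)         # guard against tiny negative eigenvalues
    return (V * w**p) @ V.T


def geometric_mean(P, Q):
    """Matrix geometric mean P # Q = P^{1/2} (P^{-1/2} Q P^{-1/2})^{1/2} P^{1/2}."""
    P_half = _spd_power(P, 0.5)
    P_inv_half = _spd_power(P, -0.5)
    return P_half @ _spd_power(P_inv_half @ Q @ P_inv_half, 0.5) @ P_half


def pair_scatter(X1, X2):
    """Sum of (x1 - x2)(x1 - x2)^T over row-aligned pairs in X1 and X2."""
    diffs = X1 - X2
    return diffs.T @ diffs


def gmml_metric(S, D, reg=1e-6):
    """GMML closed form: A = S^{-1} # D, minimizing tr(A S) + tr(A^{-1} D)."""
    eye = np.eye(S.shape[0])
    S = S + reg * eye                   # hypothetical ridge term for invertibility
    D = D + reg * eye
    return geometric_mean(np.linalg.inv(S), D)


def ugmml_metric_sketch(S, D, U, lam=1.0, reg=1e-6):
    """ASSUMPTION, not the paper's stated method: treat Universum-to-target
    pairs like dissimilar pairs by adding their scatter U, weighted by lam,
    to D, so the learned metric also pushes Universum data away from
    target-class data."""
    return gmml_metric(S, D + lam * U, reg=reg)


def mahalanobis_sq(A, x, y):
    """Squared distance (x - y)^T A (x - y) under the learned metric A."""
    d = x - y
    return float(d @ A @ d)
```

Here S, D, and U would each be built with `pair_scatter` from same-class pairs, different-class pairs, and Universum-to-target pairs respectively. Setting `lam=0` recovers plain GMML, which makes the effect of the Universum term easy to compare.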
Authors
刘鸿
陈晓红
张恩豪
LIU Hong; CHEN Xiaohong; ZHANG Enhao (College of Science, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China)
Source
《计算机工程与应用》
CSCD
Peking University Core Journal
2019, Issue 13, pp. 158-164, 238 (8 pages in total)
Computer Engineering and Applications