Abstract
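Similarity (or dissimilarity) between patterns based on distance metrics is widely used in fields such as pattern recognition and machine learning. The most commonly used metrics are the Euclidean distance and the Mahalanobis distance. Although the Euclidean distance is relatively simple to compute, it often fails to meet practical needs because of limitations such as its inability to incorporate prior knowledge and its equal treatment of all samples. One effective way to address such problems is to adopt a non-Euclidean metric such as the Mahalanobis metric, which can both incorporate the statistical properties of the data and account for correlations among samples. This paper discusses and proves properties of the Mahalanobis distance metric, mainly: (1) the differences and connections between the two metrics; (2) the point-to-plane (hyperplane) distance formula and the projection formula derived under the Mahalanobis metric; (3) the fact that both metrics are distance preserving. Finally, experiments verifying these properties are presented.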
Pattern similarity (or dissimilarity) based on distance metrics is widely used in pattern recognition and machine learning. Among distance metrics, the Euclidean and Mahalanobis distances receive the most attention. Although the Euclidean distance is easier to compute than non-Euclidean alternatives, it often cannot meet the demands of practical applications, because it fails to exploit the structural information of the given data and treats all samples equally. One effective strategy for overcoming these shortcomings is to adopt a non-Euclidean metric such as the Mahalanobis metric. The Mahalanobis metric can exploit prior statistical characteristics of the given data points, and it also accounts for the correlation between points when computing the similarity of a pair of unknown points. In this paper, several properties of the Mahalanobis distance are presented in comparison with the Euclidean distance and proved mathematically. Concretely, the Euclidean distance is a special case of the Mahalanobis distance in which the covariance matrix degenerates to the identity matrix. Furthermore, there must exist an orthogonal matrix that transforms the Mahalanobis metric into its corresponding Euclidean metric. As in the Euclidean case, the paper also shows how to compute the distance and the projection between a given data point and a plane (or hyperplane) under the Mahalanobis metric. From the viewpoint of the L2 norm, and by vector and matrix norm theory, the two metrics are consistent in nature; that is, both are distance preserving. Finally, experiments on several toy problems, together with their analysis, illustrate these properties.
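The properties listed in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the paper. The function names (`mahalanobis`, `whiten`, `project_to_hyperplane`, `distance_to_hyperplane`), the hyperplane form w^T x + b = 0, and the use of a Cholesky factor for the linear map are all assumptions made here for illustration. The point-to-hyperplane expressions are the standard Lagrange-multiplier result for minimizing the Mahalanobis distance subject to the plane constraint, and may differ in form from the formulas the paper derives.

```python
import numpy as np


def mahalanobis(x, y, cov):
    """Mahalanobis distance sqrt((x - y)^T cov^{-1} (x - y))."""
    d = x - y
    # Solve cov * v = d instead of forming the explicit inverse.
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))


def whiten(points, cov):
    """Map rows of `points` so that Euclidean distance in the new space
    equals Mahalanobis distance in the original one.

    Assumption: uses the Cholesky factor cov = L L^T and z = L^{-1} x;
    this is one of several valid linear maps, not necessarily the paper's.
    """
    L = np.linalg.cholesky(cov)
    return np.linalg.solve(L, points.T).T


def project_to_hyperplane(x0, w, b, cov):
    """Point on {x : w^T x + b = 0} closest to x0 in Mahalanobis distance."""
    sw = cov @ w
    return x0 - sw * (w @ x0 + b) / (w @ sw)


def distance_to_hyperplane(x0, w, b, cov):
    """Mahalanobis distance from x0 to the hyperplane w^T x + b = 0."""
    return float(abs(w @ x0 + b) / np.sqrt(w @ cov @ w))


if __name__ == "__main__":
    # Hypothetical toy data: a positive-definite covariance and two points.
    cov = np.array([[2.0, 0.6], [0.6, 1.0]])
    x = np.array([1.0, 2.0])
    y = np.array([3.0, -1.0])

    # Identity covariance reduces to the Euclidean distance.
    print(mahalanobis(x, y, np.eye(2)), np.linalg.norm(x - y))

    # After the linear map, the plain L2 distance matches the Mahalanobis one.
    zx, zy = whiten(np.vstack([x, y]), cov)
    print(np.linalg.norm(zx - zy), mahalanobis(x, y, cov))

    # The projection lies on the plane, and its Mahalanobis distance
    # to x agrees with the closed-form distance.
    w, b = np.array([1.0, -2.0]), 0.5
    p = project_to_hyperplane(x, w, b, cov)
    print(w @ p + b, mahalanobis(x, p, cov), distance_to_hyperplane(x, w, b, cov))
```

Solving the linear system rather than inverting the covariance matrix is a numerical design choice. It assumes the covariance is symmetric positive definite, which `np.linalg.cholesky` also requires.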
Source
《南京大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2013, No. 2, pp. 133-141 (9 pages)
Journal of Nanjing University (Natural Science)
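Funding
National Natural Science Foundation of China (60903130)
Natural Science Foundation of Jiangsu Province (BK2012815)
Nanjing Forestry University Talent Fund (B2009-14)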
Keywords
Euclidean distance
Mahalanobis distance
metric learning
similarity