Research on several key problems of Mahalanobis metric learning and corresponding geometrical interpretations (cited by: 17)
Abstract: Measuring pattern similarity (or dissimilarity) with a distance metric is widely used in pattern recognition and machine learning. The most common metrics are the Euclidean distance and the Mahalanobis distance. The Euclidean distance is comparatively simple to compute, but it cannot incorporate prior knowledge or the structural information of the data and it treats all samples equally, so it often fails to meet practical needs. One effective remedy is to adopt a non-Euclidean metric such as the Mahalanobis metric, which exploits the statistical characteristics of the data and also accounts for correlations between samples. This paper discusses and proves several properties of the Mahalanobis distance in comparison with the Euclidean distance: (1) the differences and connections between the two metrics: the Euclidean distance is the special case of the Mahalanobis distance in which the covariance matrix degenerates to the identity matrix, and there exists an orthogonal matrix that transforms the Mahalanobis metric into its corresponding Euclidean metric; (2) formulas for the distance from a point to a plane (hyperplane), and for the corresponding projection, derived under the Mahalanobis metric; (3) from the viewpoint of the L2 norm, and by vector and matrix norm theory, the two metrics are consistent in nature, that is, both are distance preserving. Finally, experiments on toy problems and the accompanying analysis illustrate these properties.
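Source: Journal of Nanjing University (Natural Science), 2013, No. 2, pp. 133-141 (9 pages). Indexed in: CAS, CSCD, Peking University Core Journals.
Funding: National Natural Science Foundation of China (60903130); Natural Science Foundation of Jiangsu Province (BK2012815); Nanjing Forestry University Talent Fund (B2009-14).
Keywords: Euclidean distance; Mahalanobis distance; metric learning; similarity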
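
The three properties listed in the abstract can be checked numerically with a small sketch. It is not taken from the paper: the function names, the toy covariance matrix, and the sample points are all hypothetical. The transformation is built here from the eigendecomposition of the covariance matrix (an orthogonal eigenvector matrix followed by a rescaling), which is one common way to map Mahalanobis distances to Euclidean ones and may differ from the construction the authors use. The hyperplane formulas are the standard closed-form solution of the constrained minimization, written in notation assumed for this example.

```python
import numpy as np

def mahalanobis(x, y, cov):
    """Mahalanobis distance between x and y for covariance matrix cov."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Solve cov @ z = diff rather than forming the explicit inverse.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

def whitening_matrix(cov):
    """Return W with W.T @ W = inv(cov), from cov = U diag(lam) U^T."""
    lam, U = np.linalg.eigh(cov)          # U is orthogonal, lam > 0
    return np.diag(lam ** -0.5) @ U.T

def hyperplane_distance(x, w, b, cov):
    """Mahalanobis distance from x to the hyperplane w.z + b = 0."""
    return float(abs(w @ x + b) / np.sqrt(w @ cov @ w))

def hyperplane_projection(x, w, b, cov):
    """Point on w.z + b = 0 closest to x in the Mahalanobis sense."""
    return x - (w @ x + b) / (w @ cov @ w) * (cov @ w)

# Hypothetical toy data, chosen only for illustration.
cov = np.array([[4.0, 1.0],
                [1.0, 2.0]])
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
w = np.array([1.0, 1.0])
b = -1.0

# (1) With an identity covariance the Mahalanobis distance is Euclidean.
d_identity = mahalanobis(x, y, np.eye(2))
d_euclid = float(np.linalg.norm(x - y))

# (3) After the linear map W, Euclidean distance equals Mahalanobis distance.
W = whitening_matrix(cov)
d_mahal = mahalanobis(x, y, cov)
d_whitened = float(np.linalg.norm(W @ x - W @ y))

# (2) Distance and projection from a point to a hyperplane.
p = hyperplane_projection(x, w, b, cov)
d_plane = hyperplane_distance(x, w, b, cov)

print(d_identity, d_euclid)
print(d_mahal, d_whitened)
print(p, d_plane, mahalanobis(x, p, cov))
```

Each printed line should show matching values: the first for the identity-covariance special case, the second for distance preservation under the transformation, and the last for the hyperplane distance computed from the formula versus the distance to the projected point.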