Abstract
Measuring the similarity between data in high-dimensional spaces is an important problem in fields such as data mining, information processing, and information retrieval. After summarizing and analyzing the characteristics of high-dimensional data and several existing measurement methods, this paper proposes a new measurement approach that first partitions the original data space into a grid before measuring similarity. The paper closes with a quantitative analysis of the method's effectiveness in high-dimensional spaces, and the experiments indicate that the approach is effective.
Source
《数学的实践与认识》
CSCD
PKU Core Journal
2006, No. 9, pp. 189-194 (6 pages)
Mathematics in Practice and Theory
Funding
National Natural Science Foundation of China (60473117)
Keywords
curse of dimensionality
similarity measurement
distance measurement