
EFFICIENT SIMILARITY COMPUTING METHODS IN HIGH DIMENSIONAL DATA (cited by: 4)
Abstract: Similarity computation is a key problem in lazy learning research such as case-based reasoning (CBR) and k-nearest-neighbor (k-NN) classification. This paper studies methods for reducing the cost of similarity computation and, using k-NN as the example, introduces two algorithms: a similarity algorithm based on partial features and a similarity algorithm based on projection. Both improve efficiency by reducing the number of features involved in the distance computation. The experiments show a clear gain: the partial-feature-based k-NN algorithm improves efficiency by 26% to 28%, and the projection-based k-NN algorithm by 48% to 83%. The authors have also applied these algorithms in engineering practice.
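The abstract does not give the algorithms themselves, so the following is only a minimal sketch of the general idea of touching fewer features per distance computation, not the paper's method. It assumes squared Euclidean distance and substitutes two well-known stand-ins: early abandoning of a partial distance sum (for the partial-feature idea) and an axis-aligned projection onto a chosen subset of coordinates (for the projection idea). All function names and the `dims` parameter are hypothetical.

```python
import heapq


def sq_dist(a, b):
    """Squared Euclidean distance, accumulated feature by feature."""
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
    return total


def knn_full(query, data, k):
    """Baseline: exact k-NN using every feature of every row."""
    scored = [(sq_dist(query, row), i) for i, row in enumerate(data)]
    return sorted(i for _, i in heapq.nsmallest(k, scored))


def knn_partial(query, data, k):
    """Exact k-NN that stops summing a row's distance once the partial sum
    can no longer beat the current k-th best (assumed stand-in technique)."""
    best = []  # max-heap through negation: (-distance, -index)
    for i, row in enumerate(data):
        bound = -best[0][0] if len(best) == k else float("inf")
        total = 0.0
        for x, y in zip(query, row):
            total += (x - y) ** 2
            if total >= bound:
                break  # remaining features cannot make this row a neighbor
        else:
            # Every feature was used and the row beats the current bound.
            if len(best) == k:
                heapq.heapreplace(best, (-total, -i))
            else:
                heapq.heappush(best, (-total, -i))
    return sorted(-neg_i for _, neg_i in best)


def knn_projected(query, data, k, dims):
    """Approximate k-NN that compares only the coordinates in `dims`
    (an axis-aligned projection; the choice of `dims` is an assumption)."""
    projected_query = [query[d] for d in dims]
    scored = [(sq_dist(projected_query, [row[d] for d in dims]), i)
              for i, row in enumerate(data)]
    return sorted(i for _, i in heapq.nsmallest(k, scored))
```

In this sketch `knn_partial` returns the same neighbors as `knn_full` while skipping the tail features of rows that are clearly too far away, whereas `knn_projected` trades exactness for a fixed, smaller number of features per comparison. How the paper selects features or builds its projection is not stated in the abstract.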
Source: Journal of Computer Research and Development (计算机研究与发展), indexed in EI, CSCD, and the Peking University Core Journals list, 2000, No. 10, pp. 1166-1172 (7 pages)
Funding: National Natural Science Foundation of China (grant no. 69803010); National "863" High-Tech Research and Development Program of China (grant nos. 863-511-946, 863-818-07)
Keywords: similarity, computing methods, high-dimensional data, data reduction, nearest neighbor, lazy learning, data mining, database, KDD

