
Dissimilarity Measure Based on Parameter-Free Data Mining
Abstract: There are many cases in which parameter settings cause data mining to produce anomalous results. The idea of parameter-free data mining emerged to address this problem. This paper studies Kolmogorov complexity theory, combines it with the parameter-free data mining approach, and proposes a compression-based dissimilarity measure, SCDM. Because compression algorithms are typically efficient in both space and time, a dissimilarity measure built on them also performs well. Experiments show that applying this measure to hierarchical clustering yields relatively high clustering accuracy.
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2006, No. 12, pp. 2982-2984 (3 pages)
Funding: Supported by the Natural Science Foundation of Henan Province (0211050110)
Keywords: parameter-free data mining; Kolmogorov complexity; compression algorithm; dissimilarity measure; hierarchical clustering
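
The abstract does not give the definition of SCDM, so the following is only a minimal sketch of the general idea behind compression-based dissimilarity, not the paper's measure. It assumes the widely used form CDM(x, y) = C(xy) / (C(x) + C(y)), where C(·) is the compressed size and serves as a computable stand-in for Kolmogorov complexity. The choice of zlib as the compressor and the function names `compressed_size` and `cdm` are illustrative assumptions.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data.

    Used here as a rough, computable proxy for Kolmogorov complexity.
    """
    return len(zlib.compress(data, 9))


def cdm(x: bytes, y: bytes) -> float:
    """Compression-based dissimilarity C(xy) / (C(x) + C(y)).

    Assumed form, not the SCDM of the paper. Values closer to 1 suggest
    the inputs share little structure; smaller values suggest that
    compressing them together saves space, i.e. they are more alike.
    """
    return compressed_size(x + y) / (compressed_size(x) + compressed_size(y))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    repetitive = b"abc" * 200
    noise = bytes(rng.randrange(256) for _ in range(600))

    print("repetitive vs itself:", cdm(repetitive, repetitive))
    print("repetitive vs noise: ", cdm(repetitive, noise))
```

A pairwise matrix of such values could then be passed to a hierarchical clustering routine. No parameters beyond the choice of compressor need tuning, which is the sense in which the approach is parameter-free.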

