
Research on a DSM-Based Knowledge Reduction Method (cited by: 1)

Data Reduction Based on DSM
Abstract: Based on the differences and similarities among object attributes, and on an analysis of the properties of the elements md_ij and ms_ij of the difference-similitude matrix (DSM), the paper defines the significance and the merging degree (uniformity) of attributes and gives a method for finding the revised subset of the optimal attribute reduct. From these it derives a DSM-based knowledge reduction method that generates a small number of rules while keeping the rules consistent and using only a subset of the condition attributes. The method is applied to data sets from the UCI machine learning repository and compared with reduction by rough set theory. The comparison indicates that the DSM method is reasonable and effective: it achieves a higher reduction rate of instances, and it outperforms rough set theory in both reduction efficiency and rule accuracy, with validity checked by leave-one-out testing.
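Authors: JIANG Hao (江昊), YAN Puliu (晏蒲柳)
Source: Journal of Wuhan University: Natural Science Edition (《武汉大学学报(理学版)》), indexed in CAS, CSCD, and the Peking University Core list; 2003, No. 3, pp. 378-382 (5 pages)
Funding: National Natural Science Foundation of China (grant 90204008)
Keywords: DSM (difference similitude matrix); knowledge reduction; data reduction; rough set theory; UCI machine learning database; attribute reduct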
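The abstract does not spell out how md_ij and ms_ij are built, so the following is only a minimal sketch under a commonly used reading of a difference-similitude matrix: for two objects with different decisions, md_ij is the set of condition attributes on which they differ; for two objects with the same decision, ms_ij is the set of condition attributes on which they agree. The function names, the toy decision table, and the consistency check are illustrative assumptions, not the authors' definitions of significance or merging degree.

```python
from itertools import combinations


def difference_similitude(rows, decisions):
    """Build md and ms sets for every object pair (i, j) with i < j.

    Assumed definitions (not taken from the paper):
      md[(i, j)]: attributes that differ, for pairs with different decisions
      ms[(i, j)]: attributes that agree, for pairs with the same decision
    """
    n_attrs = len(rows[0])
    md, ms = {}, {}
    for i, j in combinations(range(len(rows)), 2):
        differing = {a for a in range(n_attrs) if rows[i][a] != rows[j][a]}
        if decisions[i] != decisions[j]:
            md[(i, j)] = differing
        else:
            ms[(i, j)] = set(range(n_attrs)) - differing
    return md, ms


def preserves_consistency(md, subset):
    """True if every pair with different decisions is still told apart by
    at least one attribute in `subset`. An empty md entry (two identical
    objects with different decisions) makes this False for any subset."""
    return all(diff & subset for diff in md.values())


# Hypothetical toy decision table: three condition attributes, one decision.
rows = [(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
decisions = ["yes", "yes", "no", "no"]

md, ms = difference_similitude(rows, decisions)
print(md)
print(ms)
print(preserves_consistency(md, {0}))
print(preserves_consistency(md, {1}))
```

In this toy table attribute 0 alone separates every pair of objects with different decisions, while attribute 1 alone does not, which illustrates how a reduct can use only part of the condition attributes without making the rules inconsistent.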
