
Multiple Variable Discretization Algorithm of Continuous Attributes in Rough Set Theory Based on Information Entropy (cited by: 2)
Abstract: Attribute discretization reduces problem complexity and yields shorter, more accurate, and more comprehensible rules. Existing discretization methods do not consider the mutual exclusion of breakpoints between and within attributes when selecting breakpoints, and they cannot guarantee that the indiscernibility relation of the decision table is preserved. To address this, the paper proposes a new multiple variable discretization algorithm for continuous attributes in rough set theory based on information entropy (PAD). The algorithm uses information entropy as the measure for choosing breakpoints, takes the indiscernibility relation as the stopping criterion, and introduces five strategies for breakpoint pre-selection and final selection. Experimental results show that PAD, with these pre-selection and final-selection strategies, achieves higher prediction accuracy and fewer breakpoints than the five discretization algorithms in the Rosetta software.
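Authors: 王举范 (Wang Jufan), 陈卓 (Chen Zhuo)
Source: Journal of Qingdao University of Science and Technology: Natural Science Edition (《青岛科技大学学报(自然科学版)》), indexed in CAS and the Peking University Core Journals list, 2013, No. 4, pp. 423-426 (4 pages)
Funding: National Natural Science Foundation of China (61273180)
Keywords: rough sets; indiscernibility relation; discretization; information entropy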
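The abstract names two ingredients: information entropy as the breakpoint selection measure, and preservation of the decision table's indiscernibility relation as the stopping criterion. The following is a minimal Python sketch of that general idea only, not the PAD algorithm itself. The five pre-selection and final-selection strategies are not described in this record and are omitted. The function names, the use of midpoints as candidate breakpoints, and the greedy search are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of decision labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def candidate_cuts(rows):
    """Assumed candidates: midpoints between consecutive distinct values
    of every attribute, returned as (attribute index, cut value) pairs."""
    cuts = []
    for a in range(len(rows[0])):
        values = sorted({r[a] for r in rows})
        cuts += [(a, (lo + hi) / 2) for lo, hi in zip(values, values[1:])]
    return cuts


def blocks(rows, labels, cuts):
    """Group decision labels by the cell each object falls into under `cuts`.
    Objects in the same cell are indiscernible after discretization."""
    groups = {}
    for r, y in zip(rows, labels):
        key = tuple(r[a] > c for a, c in cuts)
        groups.setdefault(key, []).append(y)
    return list(groups.values())


def weighted_entropy(groups, n):
    """Size-weighted average entropy of the decision labels over all cells."""
    return sum(len(g) / n * entropy(g) for g in groups)


def greedy_entropy_discretize(rows, labels):
    """Hypothetical greedy loop: add the cut (across all attributes) that
    minimises weighted entropy, and stop once every cell is pure, i.e. no two
    objects with different decisions have become indiscernible.
    If the input table is itself inconsistent, the loop ends when the
    candidate cuts are exhausted."""
    n = len(rows)
    remaining = candidate_cuts(rows)
    chosen = []
    while remaining:
        if all(len(set(g)) == 1 for g in blocks(rows, labels, chosen)):
            break
        best = min(
            remaining,
            key=lambda cut: weighted_entropy(blocks(rows, labels, chosen + [cut]), n),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    rows = [(1.0, 5.0), (2.0, 5.0), (3.0, 5.0), (4.0, 5.0)]
    labels = ["a", "a", "b", "b"]
    print(greedy_entropy_discretize(rows, labels))
```

Because candidates from all attributes compete in a single pool, a cut on one attribute can make cuts on another unnecessary. This is one simple way to read the "multiple variable" aspect; the paper's own treatment may differ.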

