Abstract
Decision trees are a widely used classification method in data mining. To address the inconsistent data caused by noise in university graduate employment records, this paper proposes a decision tree model based on variable precision rough sets and applies it to the analysis of student employment data. The method uses the variable precision rough set measure of classification quality as its information function to select condition attributes, which become the tree nodes, and splits the data set top-down until a termination condition is met. It takes the dependency and redundancy among attributes into account and allows a certain degree of inconsistency among the instances assigned to the positive region during tree construction. Experiments show that the algorithm handles inconsistent data sets effectively and classifies the employment data correctly and reasonably, yielding several useful conclusions for decision analysis. The algorithm also improves the generalization ability of the decision rules and simplifies the tree structure.
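The attribute-selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the common majority-inclusion form of the β-positive region (an equivalence class counts as positive when at least a fraction β of its rows share one decision value, with 0.5 < β ≤ 1), and the function names, column names, and toy records are hypothetical.

```python
from collections import Counter, defaultdict


def classification_quality(rows, attrs, decision, beta=0.8):
    """Share of rows lying in the beta-positive region of `decision`
    with respect to the condition attributes `attrs` (assumed definition)."""
    # Group decision labels by equivalence class of the condition attributes.
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in attrs)].append(row[decision])

    positive = 0
    for labels in blocks.values():
        # Size of the largest decision class inside this block.
        majority = Counter(labels).most_common(1)[0][1]
        # The block is tolerated as "consistent enough" if the majority
        # share reaches beta; beta = 1.0 gives the classical positive region.
        if majority / len(labels) >= beta:
            positive += len(labels)
    return positive / len(rows)


def best_attribute(rows, candidates, decision, beta=0.8):
    """Pick the candidate attribute with the highest classification quality."""
    return max(
        candidates,
        key=lambda a: classification_quality(rows, [a], decision, beta),
    )


# Hypothetical toy records; not data from the paper.
rows = [
    {"gpa": "high", "intern": "yes", "employed": "yes"},
    {"gpa": "high", "intern": "yes", "employed": "yes"},
    {"gpa": "high", "intern": "no", "employed": "yes"},
    {"gpa": "high", "intern": "no", "employed": "no"},
    {"gpa": "low", "intern": "yes", "employed": "yes"},
    {"gpa": "low", "intern": "no", "employed": "no"},
    {"gpa": "low", "intern": "no", "employed": "no"},
    {"gpa": "low", "intern": "no", "employed": "no"},
]

root = best_attribute(rows, ["gpa", "intern"], "employed", beta=0.8)
```

In this toy data, lowering β below 1.0 lets the noisy "intern = no" block (four of five rows unemployed) enter the positive region, which is the tolerance for inconsistency that the abstract refers to. A full tree builder would apply `best_attribute` recursively to each resulting subset until a stopping condition holds.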
Source
Computer Engineering & Science (《计算机工程与科学》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2011, No. 5, pp. 141-145 (5 pages)
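Funding
Teaching Reform Project of Luoyang Normal University (2008-26)
Natural Science Research Program of the Education Department of Henan Province (2010A520030)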
Keywords
decision tree
variable precision rough set
employment of university graduates
decision rules