Abstract
Decision trees are efficient and widely used classification algorithms in data mining. Traditional decision tree algorithms have difficulty handling uncertain information in data, such as fuzziness. Fuzzy decision trees, which extend classical decision trees using fuzzy set theory, effectively overcome this limitation. However, when existing fuzzy decision tree algorithms process data whose labels have a hierarchical structure, they generally classify the data using the labels of a single level of the hierarchy. As a result, when classification accuracy is high the labels are not specific, and when the labels are specific the accuracy is low; these algorithms cannot keep the hierarchical labels as specific as possible while keeping classification accuracy as high as possible. To address this problem, a fuzzy decision tree construction algorithm for hierarchically labeled data is proposed. The algorithm combines the fuzzy ID3 algorithm with the idea of hierarchical information gain to classify the data, and fully considers the label hierarchy during tree construction. Finally, an experimental comparison with traditional fuzzy decision tree algorithms demonstrates the effectiveness of the proposed algorithm.
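For orientation only, the following is a minimal sketch of the membership-weighted entropy and information gain used in fuzzy ID3-style splitting, evaluated at different levels of a label hierarchy. The abstract does not define hierarchical information gain, so the label-path representation, the product combination of memberships, and the function names `lift`, `fuzzy_entropy`, and `fuzzy_information_gain` are all assumptions made for illustration, not the authors' formulation.

```python
import math


def lift(labels, depth):
    """Truncate each label path to the given depth of the hierarchy.

    Assumption: a hierarchical label is a tuple running from the root
    level to the most specific level, e.g. ("animal", "dog").
    """
    return [path[:depth] for path in labels]


def fuzzy_entropy(memberships, labels):
    """Entropy of a fuzzy node; class proportions are membership-weighted."""
    total = sum(memberships)
    if total == 0:
        return 0.0
    mass = {}
    for m, y in zip(memberships, labels):
        mass[y] = mass.get(y, 0.0) + m
    return -sum(
        (v / total) * math.log2(v / total) for v in mass.values() if v > 0
    )


def fuzzy_information_gain(node_memberships, term_memberships, labels):
    """Gain from splitting a fuzzy node on one attribute.

    term_memberships holds one membership list per fuzzy term of the
    attribute. A sample's membership in a child node is taken as the
    product of its node and term memberships (an assumed choice).
    """
    parent = fuzzy_entropy(node_memberships, labels)
    children = [
        [n * t for n, t in zip(node_memberships, term)]
        for term in term_memberships
    ]
    sizes = [sum(child) for child in children]
    total = sum(sizes)
    if total == 0:
        return 0.0
    weighted = sum(
        (size / total) * fuzzy_entropy(child, labels)
        for size, child in zip(sizes, children)
    )
    return parent - weighted
```

Calling `fuzzy_information_gain(node, terms, lift(labels, d))` for each depth `d` gives one gain value per level of the hierarchy. How those per-level values should be combined into a single splitting criterion is what the paper's method addresses and is not reproduced here.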
Authors
WANG Zhong (王忠)
SHE Yanhong (折延宏)
ZHENG Yi (郑逸)
Department of Computer, Xi'an Shiyou University, Xi'an 710065, China; Department of Science, Xi'an Shiyou University, Xi'an 710065, China
Source
Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报(理学版)》)
Peking University Core Journal (北大核心)
2022, Issue 2, pp. 24-31 (8 pages)
Funding
National Natural Science Foundation of China (61976244)
Natural Science Foundation of Shaanxi Province (2021JQ-580)
Keywords
classification
fuzzy set
fuzzy decision tree
hierarchical label