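Abstract
An incremental compound (hybrid) classification mining algorithm is proposed that combines symbol learning based on probability theory with neural network learning. It can effectively classify multiple concepts described by both discrete and continuous attributes, and it has strong incremental mining capability. The algorithm has been applied in a court decision support system with good results.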
Classification is one of the main methods of data mining. Its purpose is to find the common characteristics of the objects stored in a database and then use the resulting patterns to classify them. Artificial neural networks, decision trees, and genetic algorithms are common classification methods. This paper proposes an incremental compound classification algorithm that combines artificial neural network learning with symbol learning based on probability theory. The main idea is as follows. First, the symbol learning algorithm classifies a set of training instances by their discrete attributes. When instances that cannot be classified accurately are encountered, an FTART network processes them, using the continuous attributes to learn new patterns. Incremental learning is based on the component decision tree and the FTART network. When new instances are added, the algorithm needs only a single pass of incremental learning, and neither the decision tree nor the neural network has to be regenerated; the algorithm can classify the new instances correctly by adjusting the existing structure. Another important advantage is that when a new input pattern is presented to a trained FTART network, the new network structure can be generated simply by adding nodes to the second layer of the network, unlike the traditional BP algorithm. The efficiency and speed of learning are therefore greatly improved. The two main classes of the algorithm are ROOT and NODE. The ROOT class contains the learning algorithm and the attributes in which the decision tree is stored. The NODE class is contained in the ROOT class and defines the nodes of the decision tree. The detailed steps of the algorithm are discussed in the paper.
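The two-stage idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: `PrototypeNet` is a simple nearest-prototype learner standing in for the FTART network (it only mimics the "grow by adding a node" behaviour), and the `threshold` and `radius` parameters, the method names, and the internals of `Root` and `Node` are all assumptions made for illustration. As a simplification, the stand-in network is updated on every instance rather than only on those the symbolic stage cannot classify.

```python
import math
from collections import Counter, defaultdict


class Node:
    """Hypothetical tree node: class counts for one combination of discrete values."""

    def __init__(self):
        self.counts = Counter()

    def add(self, label):
        self.counts[label] += 1

    def confident_label(self, threshold):
        # Return the majority class only if its relative frequency is high enough.
        total = sum(self.counts.values())
        if total == 0:
            return None
        label, n = self.counts.most_common(1)[0]
        return label if n / total >= threshold else None


class PrototypeNet:
    """Stand-in for FTART (NOT FTART itself): grows by one prototype per new pattern."""

    def __init__(self, radius):
        self.radius = radius
        self.prototypes = []  # list of (vector, label) pairs

    def learn(self, x, label):
        # Skip the instance if an existing prototype of the same class covers it.
        for vec, lab in self.prototypes:
            if lab == label and math.dist(vec, x) <= self.radius:
                return
        self.prototypes.append((tuple(x), label))  # add a node, no retraining

    def predict(self, x):
        if not self.prototypes:
            return None
        return min(self.prototypes, key=lambda p: math.dist(p[0], x))[1]


class Root:
    """Hypothetical container holding the symbolic nodes and the fallback networks."""

    def __init__(self, threshold=0.9, radius=0.5):
        self.threshold = threshold
        self.nodes = defaultdict(Node)
        self.nets = defaultdict(lambda: PrototypeNet(radius))

    def learn(self, discrete, continuous, label):
        # Single-pass incremental update: adjust counts and, if needed, add a prototype.
        key = tuple(discrete)
        self.nodes[key].add(label)
        self.nets[key].learn(continuous, label)

    def classify(self, discrete, continuous):
        key = tuple(discrete)
        node = self.nodes.get(key)
        if node is not None:
            label = node.confident_label(self.threshold)
            if label is not None:
                return label  # discrete attributes alone are decisive
        net = self.nets.get(key)
        return net.predict(continuous) if net is not None else None
```

In this sketch a new instance only increments a counter and possibly appends one prototype, which is the property the abstract emphasises: existing structure is adjusted rather than rebuilt.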
The incremental compound classification algorithm can process multi-concept data sets that contain both discrete and continuous attributes, and it performs well in incremental data mining. The algorithm has been tested in a decision support system for courts. In that system, a new data mining architecture based on a data warehouse is adopted to handle the different data mining tasks, and the incremental compound classification algorithm is one component of its data mining tool set. In practice, the algorithm achieves good results.
Source
《南京大学学报(自然科学版)》
CAS
CSCD
PKU Core Journal
2001, No. 2, pp. 142-147 (6 pages)
Journal of Nanjing University(Natural Science)
Funding
National Natural Science Foundation of China (60003010)