Abstract
Classification is an important research direction in data mining. Traditional methods such as neural networks and Fisher discriminant analysis have shortcomings: neural networks offer no intuitive explanation of their classification results, while the accuracy of Fisher discriminant analysis drops sharply on large data sets and it has no attribute reduction capability. To address these problems, this paper (1) proposes the idea of obtaining the best threshold automatically; (2) proposes using a BP neural network classifier to reclassify the instances that Gene Expression Programming (GEP) classifies incorrectly; (3) proposes the Attribution Reduction Classification Algorithms Based on GEP and Neural Network (ARCA-GEPNN); and (4) reports experiments showing that the classification accuracy of ARCA-GEPNN is about 25% higher than that of Fisher discriminant analysis and about 21% higher than that of GEP.
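The abstract does not say how the best threshold is obtained, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes a binary problem in which a first-stage discriminant (for example a GEP-evolved expression) yields one real-valued score per instance. The names `best_threshold` and `misclassified`, the midpoint candidate scheme, and the toy data are all hypothetical. The instances returned by `misclassified` are the ones that would be handed to the second-stage neural network classifier, which is not implemented here.

```python
from typing import List, Sequence, Tuple


def best_threshold(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return the (threshold, accuracy) pair with the highest training accuracy.

    Assumption: an instance is predicted as class 1 when score >= threshold.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and of equal length")

    distinct = sorted(set(scores))
    # Candidates: one below all scores, midpoints between neighbours, one above all.
    candidates = [distinct[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(distinct, distinct[1:])]
    candidates.append(distinct[-1] + 1.0)

    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        correct = sum((s >= t) == (y == 1) for s, y in zip(scores, labels))
        acc = correct / len(scores)
        if acc > best_acc:  # strict comparison keeps the lowest threshold on ties
            best_t, best_acc = t, acc
    return best_t, best_acc


def misclassified(scores: Sequence[float], labels: Sequence[int], threshold: float) -> List[int]:
    """Indices of instances the thresholded first stage gets wrong."""
    return [i for i, (s, y) in enumerate(zip(scores, labels))
            if (s >= threshold) != (y == 1)]


# Toy data (hypothetical): first-stage scores and true binary labels.
scores = [-2.0, -1.0, 0.5, 1.0, 3.0, 4.0]
labels = [0, 0, 1, 0, 1, 1]

threshold, accuracy = best_threshold(scores, labels)
second_stage = misclassified(scores, labels, threshold)
print(threshold, accuracy, second_stage)
```

Scanning midpoints between sorted scores is enough because training accuracy can only change when the threshold crosses an observed score. A real implementation might select the threshold on held-out data instead of the training set.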
Source
《计算机工程与应用》
CSCD
Peking University Core Journal
2006, Issue 23, pp. 154-157, 172 (5 pages)
Computer Engineering and Applications
Funding
National 973 Key Basic Research Development Program of China (Grant No. 2002CB111504)
Natural Science Foundation of Guangxi (Grant No. 0339039)