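Abstract: Imbalanced data poses a major challenge to traditional associative classification algorithms. To improve the accuracy of associative classification on imbalanced data sets, this paper works at both the data level and the rule level, proposing key value sampling (KVS) and rule validation (RV). Key value sampling balances the class distribution by adding data strongly correlated with the minority class and removing data weakly correlated with the majority class. This avoids losing a large amount of useful information and strengthens the information carried by data strongly correlated with the minority class. Rule validation checks the rules of the initially generated classifier and adjusts those with poor classification performance, thereby safeguarding the quality of the rules in the classifier. Experiments show that the proposed methods effectively improve the accuracy of associative classification on imbalanced data.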
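The resampling idea described in the abstract can be illustrated with a rough sketch. The abstract does not say how correlation with a class is measured, so everything concrete here is an assumption: the hypothetical helper `value_class_confidence` estimates P(class | attribute = value), a record is scored by the mean of those confidences over its attribute values, and the hypothetical function `key_value_resample` duplicates the highest-scoring minority records while dropping the lowest-scoring majority records until the two classes are the same size. It is not the authors' KVS algorithm.

```python
from collections import Counter


def value_class_confidence(records, labels):
    """Return a function giving the estimated P(label | attribute i == value v).

    Assumption: this confidence stands in for the "correlation" between a
    value and a class; the abstract does not define the actual measure.
    """
    pair_counts = Counter()
    pair_label_counts = Counter()
    for record, label in zip(records, labels):
        for i, value in enumerate(record):
            pair_counts[(i, value)] += 1
            pair_label_counts[(i, value, label)] += 1

    def confidence(i, value, label):
        return pair_label_counts[(i, value, label)] / pair_counts[(i, value)]

    return confidence


def key_value_resample(records, labels, minority, majority):
    """Balance two classes by score-guided over- and undersampling (a sketch)."""
    confidence = value_class_confidence(records, labels)

    def score(record, label):
        # Mean confidence of the record's attribute values toward its own class.
        return sum(confidence(i, v, label) for i, v in enumerate(record)) / len(record)

    # Both lists are sorted from strongest to weakest association.
    minority_rows = sorted(
        (r for r, l in zip(records, labels) if l == minority),
        key=lambda r: score(r, minority), reverse=True)
    majority_rows = sorted(
        (r for r, l in zip(records, labels) if l == majority),
        key=lambda r: score(r, majority), reverse=True)
    if not minority_rows:
        raise ValueError("no minority-class records to oversample")

    duplicates = []
    next_dup = 0
    while len(minority_rows) + len(duplicates) < len(majority_rows):
        majority_rows.pop()  # drop the weakest remaining majority record
        if len(minority_rows) + len(duplicates) < len(majority_rows):
            # Duplicate minority records, strongest first, cycling if needed.
            duplicates.append(minority_rows[next_dup % len(minority_rows)])
            next_dup += 1

    kept_minority = minority_rows + duplicates
    new_records = kept_minority + majority_rows
    new_labels = [minority] * len(kept_minority) + [majority] * len(majority_rows)
    return new_records, new_labels
```

Records are expected as tuples of categorical values, for example `("a", "x")`, with one label per record. Alternating removal and duplication is only one way to split the work between the two classes; a real implementation would more likely expose the target ratio as a parameter.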
Funding: Supported by the National High Technology Research and Development Program of China (No. 2007AA01Z132), the National Natural Science Foundation of China (No. 60775035, 60933004, 60970088, 60903141), the National Basic Research Priorities Programme (No. 2007CB311004), and the National Science and Technology Support Plan (No. 2006BAC08B06).
Abstract: In this paper, we propose an enhanced associative classification method that integrates the dynamic property into the associative classification process. The proposed method uses a support vector machine (SVM) based approach to refine the discovered emerging frequent patterns, which are then used to extend the classification rules for class label prediction. The empirical study shows that our method can classify a growing collection of resources efficiently and effectively.