Abstract
Classification is one of the important tasks in data mining, and rare-class classification is an important branch of it. The problem can be described as identifying, in a dataset with a highly imbalanced class distribution, the instances that are highly significant yet occur only rarely; it has wide application in many real-world domains. This paper introduces the rare-class classification problem in detail and discusses its characteristics, the factors that affect it, and the criteria used to evaluate it. It then presents the two main current approaches to classifying rare classes, data-level methods and algorithm-level methods, and introduces several currently popular rare-class classification algorithms.
Source
《计算机技术与发展》
2010, No. 7, pp. 250-252, F0003 (4 pages in total)
Computer Technology and Development
Funding
Natural Science Foundation of Henan Province (0211050100)
Keywords
classification
rare class
emerging pattern
two-phase classification