Abstract
This paper proposes a cost-sensitive method for selecting the optimal error bound when data contain measurement errors. Misclassification costs are treated as depending on both the individual example and the test costs. Test costs and misclassification costs are generated adaptively for each candidate error bound, and, with the goal of minimizing total cost, an attribute selection method based on different error bounds is designed and then used to select the optimal error bound. Experiments on four UCI standard datasets show that the approach effectively selects the optimal error bound, so that the chosen attribute set has the lowest average total cost.
Source
Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》)
CSCD
2013, Issue 12, pp. 1146-1152 (7 pages)
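Funding
Key Science and Technology Project of the Education Department of Fujian Province
Science and Technology Project of the Education Department of Fujian Province
Natural Science Foundation of Zhangzhou City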
Keywords
rough set
measurement error
cost sensitive
feature selection