Abstract
This paper proposes a high-performance algorithm for automatic text feature selection, named FAS, that takes into account both the overall and the local distribution of features across categories. FAS introduces the cloud membership degree from cloud model theory to correct the feature distribution. It requires no prior knowledge and automatically obtains the set of features with a high cloud membership degree according to the characteristics of the feature distribution. Experimental results show that the selected feature set contains fewer features and achieves higher classification accuracy, and that it clearly outperforms the main existing feature selection methods.
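The abstract does not give the formula for the cloud membership degree, so the following is only a minimal sketch under stated assumptions. It uses the standard normal cloud model with expectation Ex, entropy En, and hyper-entropy He, where the membership of a value x is exp(-(x - Ex)^2 / (2 En'^2)) and En' is drawn from N(En, He^2). The function names, the `threshold` parameter, and the idea of scoring each feature with a single number are hypothetical illustrations, not the FAS algorithm itself.

```python
import math
import random


def cloud_membership(x, ex, en, he=0.0, rng=None):
    """Membership degree of x under a normal cloud (Ex, En, He).

    Assumption: standard forward normal cloud generator; the paper's
    exact formulation is not stated in the abstract.
    """
    rng = rng or random
    # Perturbed entropy En' ~ N(En, He^2); with He = 0 this is a plain Gaussian curve.
    en_prime = rng.gauss(en, he) if he > 0 else en
    if en_prime == 0:
        return 1.0 if x == ex else 0.0
    return math.exp(-((x - ex) ** 2) / (2 * en_prime ** 2))


def select_features(scores, ex, en, he=0.0, threshold=0.5):
    """Keep features whose membership degree reaches a hypothetical threshold."""
    return [name for name, x in scores.items()
            if cloud_membership(x, ex, en, he) >= threshold]
```

With He = 0 the result is deterministic, which makes the sketch easy to check by hand. A nonzero He adds the randomness that distinguishes a cloud model from a fixed membership function.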
Source
Journal of Central South University: Science and Technology (《中南大学学报(自然科学版)》)
EI
CAS
CSCD
PKU Core Journals (北大核心)
2011, No. 3, pp. 714-720 (7 pages)
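Funding
Sub-project of the National Major Science and Technology Project of China (2008ZX07315-001)
Chongqing Major Science and Technology Project (2008AB5038)
Fundamental Research Funds for the Central Universities (CDJXS11181160)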
Keywords
text classification
feature selection
cloud model
membership degree
dynamic clustering