Abstract
Objective The quality of the features in a feature set affects the accuracy of text classification, so choosing a good feature selection method is important to classification performance. Methods Rough set theory provides an effective tool for analyzing and reasoning about imprecise data, mining relationships among data, and discovering latent knowledge. A feature selection method based on rough sets is proposed. Results Experiments show that the method uses the reduction theory of rough sets to lower the feature dimension while preserving classification performance. When used for feature selection, it achieves better classification results than the feature selection methods in common use. Conclusion The attribute reduction theory of rough sets can be applied to rule extraction and feature selection, and feature selection based on it yields fairly good classification results.
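The abstract does not give the paper's algorithm, so the following is only a minimal sketch of one common way attribute reduction can drive feature selection: a greedy reduct search based on the rough set dependency degree (the share of documents whose equivalence class under the chosen features has a single category). The function names, the binary term-presence encoding, and the toy data are assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Fraction of rows in the positive region (blocks with a single label)."""
    if not rows:
        return 0.0
    pos = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            pos += len(block)
    return pos / len(rows)


def greedy_reduct(rows, labels, n_attrs):
    """Add attributes greedily until dependency matches that of the full set.

    Hypothetical helper: a heuristic, so the result preserves the dependency
    degree of the full attribute set but is not guaranteed to be minimal.
    """
    all_attrs = list(range(n_attrs))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        best, best_gamma = None, -1.0
        for a in all_attrs:
            if a in reduct:
                continue
            gamma = dependency(rows, labels, reduct + [a])
            if gamma > best_gamma:
                best, best_gamma = a, gamma
        reduct.append(best)
    return reduct


# Toy data (assumed): each row marks presence of three terms in a document.
rows = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
labels = ["sports", "sports", "finance", "finance"]

selected = greedy_reduct(rows, labels, n_attrs=3)
print(selected)
```

In this toy table the first term alone separates the two categories, so the search stops after selecting a single feature. Real term-document data would normally need discretized weights and a much larger vocabulary, which this sketch does not address.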
Source
《河北北方学院学报(自然科学版)》
2009, No. 1, pp. 56-59 (4 pages)
Journal of Hebei North University:Natural Science Edition
Keywords
text categorization
feature selection
rough sets
reduction
attribute