Text Categorization Method Based on the Theory of Fuzzy-rough Sets (cited by: 8)
Abstract: In text categorization, overlap between classes and a shortage of features that identify a class introduce both fuzzy uncertainty and rough uncertainty at class boundaries, which the conventional k-nearest neighbor (k-NN) method cannot handle. Moreover, in the conventional k-NN method and several of its improved variants, the optimal value of k can only be obtained through training. This paper uses the theory of fuzzy-rough sets to improve the conventional k-NN method and employs a distance-based neighbor space to determine, without training, a suitable value of k for each text to be classified. Experiments comparing the proposed method with other k-NN methods show that the fuzzy-rough set method can improve the precision and recall of text categorization to a certain degree.
Source: Journal of South China University of Technology (Natural Science Edition), indexed in EI, CAS, CSCD, and the Peking University Core Journals list, 2004, Issue z1, pp. 73-76 (4 pages)
Keywords: fuzzy-rough set; fuzzy-rough membership function; k-nearest neighbor method; text categorization; neighbor space
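
The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes: the neighbors of a text are all training texts within a fixed distance, so the number of neighbors k varies per text and is not tuned by training. The cosine distance, the `radius` parameter, the use of `1 - distance` as a fuzzy weight, and the names `cosine_distance` and `classify` are all assumptions made for illustration, not the paper's actual definitions.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    """Cosine distance between two sparse term-weight dicts (assumed metric)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def classify(query, training, radius=0.8):
    """Score classes using a distance-based neighborhood.

    training: list of (term-weight dict, label) pairs.
    radius:   hypothetical neighborhood size; every training text within
              this distance is a neighbor, so k is set per query.
    Returns (best_label, scores), or (None, {}) when no neighbor exists.
    """
    neighbors = []
    for vector, label in training:
        d = cosine_distance(query, vector)
        if d <= radius:
            neighbors.append((d, label))
    if not neighbors:
        return None, {}

    # Assumed fuzzy weight: closer neighbors contribute more to their class.
    totals = defaultdict(float)
    for d, label in neighbors:
        totals[label] += 1.0 - d

    # Averaging over the neighborhood means the scores need not sum to 1;
    # a low total signals that the query is far from everything it was
    # compared against.
    k = len(neighbors)
    scores = {label: total / k for label, total in totals.items()}
    return max(scores, key=scores.get), scores
```

Calling `classify(query, training)` with small term-weight dictionaries returns the highest-scoring label together with the per-class scores; a query with no training text inside the radius yields `(None, {})` rather than a forced decision.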
