Research and Implementation of a Text Classification System Based on Rough Set and Fuzzy Clustering Theory (cited by: 11)
Abstract: With the growth and widespread use of the Internet, an ever larger volume of text awaits reading and processing. Text classification has become a widely followed research topic that is still not well solved. This paper proposes a new text classification model based on rough set and fuzzy clustering (RS&FC) theory, and discusses in detail its overall design, main implementation techniques, and the associated algorithms and implementation scheme. Before classification rules are generated, the model builds an information table from the result of directly clustering the training samples and discretizes the continuous attributes in that table. It then clusters the feature-word attributes a second time to reduce the dimensionality of the text feature vectors, extracts keyword feature attributes, and builds a decision information table. Finally, it applies rough set theory with a heuristic reduction algorithm to reduce the table and produce optimized classification rules that guide text classification. Experiments and performance evaluation show that the proposed method achieves higher classification accuracy than the traditional K-nearest-neighbor (K-NN) method and improves the adaptability and classification ability of the system.
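The rough-set step named in the abstract (reducing a decision table, then reading rules off the reduced table) can be illustrated with a minimal Python sketch. The abstract does not say which heuristic the authors use, so the greedy dependency-degree criterion below is an assumption, not the paper's algorithm. The feature words (`rail`, `signal`, `price`), the class labels, and all function names are hypothetical, and the table is assumed to be already discretized to 0/1.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, attrs, decision):
    """Share of rows in the positive region: blocks whose rows all agree on the decision."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows, cond_attrs, decision):
    """Greedy heuristic (assumed): repeatedly add the attribute that raises
    the dependency degree most, until it matches that of the full attribute set."""
    target = dependency(rows, cond_attrs, decision)
    reduct = []
    while dependency(rows, reduct, decision) < target:
        remaining = [a for a in cond_attrs if a not in reduct]
        best = max(remaining, key=lambda a: dependency(rows, reduct + [a], decision))
        reduct.append(best)
    return reduct


def extract_rules(rows, attrs, decision):
    """Map each consistent attribute-value combination to its single decision label."""
    rules = {}
    for block in partition(rows, attrs):
        labels = {rows[i][decision] for i in block}
        if len(labels) == 1:
            rules[tuple(rows[block[0]][a] for a in attrs)] = labels.pop()
    return rules


# Hypothetical discretized decision table: 1 = feature word present, 0 = absent.
rows = [
    {"rail": 1, "signal": 0, "price": 0, "label": "transport"},
    {"rail": 1, "signal": 1, "price": 0, "label": "transport"},
    {"rail": 0, "signal": 1, "price": 1, "label": "finance"},
    {"rail": 0, "signal": 0, "price": 1, "label": "finance"},
    {"rail": 0, "signal": 1, "price": 0, "label": "control"},
    {"rail": 0, "signal": 0, "price": 0, "label": "control"},
]
cond_attrs = ["rail", "signal", "price"]

reduct = greedy_reduct(rows, cond_attrs, "label")
rules = extract_rules(rows, reduct, "label")
```

In this toy table `signal` carries no information about the label, so the reduced table keeps only `rail` and `price`, and each of the three remaining value combinations yields one classification rule. A greedy search of this kind returns a set of attributes that preserves the dependency degree, but it is not guaranteed to be a minimal reduct.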
Source: Journal of the China Railway Society (《铁道学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2007, No. 1, pp. 45-49 (5 pages)
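Funding: Open Fund of the Key Laboratory of Opto-Electronic Technology and Intelligent Control, Ministry of Education (Lanzhou Jiaotong University) (K040103); Natural Science Foundation of Gansu Province (3ZS042-B25-038)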
Keywords: rough set; fuzzy clustering; text classification; text clustering; rule reduction