A Classification Method for Chinese Text Based on Resource-Optimizing Neural Networks (RONN)
Abstract: This paper implements a text classifier using supervised machine learning. An improved CHI statistic is used to select features from the word-segmentation results, the traditional TF-IDF weighting formula is modified (the modified version is called ETF-IDF), and the classifier is built with a resource-optimizing neural network (RON, Resource-optimizing Networks; abbreviated RONN in the English title). In experiments on the Chinese text classification corpus provided by Fudan University, the classifier achieves higher classification quality than the BP algorithm, and ETF-IDF weighting outperforms traditional TF-IDF weighting, improving classification precision and performance and meeting the requirements of automatic Chinese text classification.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2010, No. 7, pp. 33-36 (4 pages)
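Funding: National Key Basic Research Program of China (973 Program), grants 2004CB318108 and 2007CB311003; National Natural Science Foundation of China, grant 60675031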
Keywords: text classification; CHI statistic; RON; resource-optimizing neural network
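
The feature-selection and weighting steps named in the abstract can be illustrated with a minimal Python sketch. It uses the standard chi-square (CHI) statistic and the traditional TF-IDF weight only; the paper's improved CHI statistic and its ETF-IDF formula are not given in this record, so they are not reproduced here. The function names (`chi_square`, `select_features`, `tf_idf`) and the toy corpus are hypothetical, and documents are assumed to be already segmented into word tokens.

```python
import math
from collections import Counter


def chi_square(docs, labels, term, category):
    """Standard chi-square score of one (term, category) pair.

    a: docs in the category that contain the term
    b: docs outside the category that contain the term
    c: docs in the category without the term
    d: docs outside the category without the term
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_cat = label == category
        if has_term and in_cat:
            a += 1
        elif has_term:
            b += 1
        elif in_cat:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_features(docs, labels, k):
    """Keep the k terms with the highest chi-square score over all categories."""
    vocab = sorted({t for tokens in docs for t in tokens})
    categories = set(labels)
    score = {
        t: max(chi_square(docs, labels, t, cat) for cat in categories)
        for t in vocab
    }
    # Sort by descending score; ties are broken alphabetically for determinism.
    return sorted(vocab, key=lambda t: (-score[t], t))[:k]


def tf_idf(docs):
    """Traditional TF-IDF: (term count / doc length) * log(N / document frequency)."""
    n = len(docs)
    df = Counter(t for tokens in docs for t in set(tokens))
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vectors.append(
            {t: (cnt / len(tokens)) * math.log(n / df[t]) for t, cnt in counts.items()}
        )
    return vectors


# Hypothetical, pre-segmented toy corpus (not the Fudan corpus).
docs = [
    ["football", "match", "goal"],
    ["goal", "team", "match"],
    ["stock", "market", "price"],
    ["market", "price", "match"],
]
labels = ["sports", "sports", "finance", "finance"]

features = select_features(docs, labels, k=3)
weights = tf_idf(docs)
```

In a pipeline like the one the abstract describes, the selected terms would define the input dimensions and the weights would fill the input vector passed to the classifier; the RON classifier itself is outside the scope of this sketch.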
