The Association Rules Aided Genetic Algorithm for Text Classification (article in English)
Abstract: Information overload is a serious problem in modern society. As a means of keeping people from getting lost in large volumes of useless information, information retrieval is becoming increasingly important, and automatic text classification is one of its most important tools. This article proposes a method for automatic Chinese text classification called the Association Rules Aided Genetic Computing Method (ARGCM). The main contributions are: 1) the Association Rules Aided Genetic Algorithm (ARGA) for text classification is proposed and implemented; 2) unlike existing work, the fitness function is coded with the help of association rules, which are mined by the AprioriARGACM algorithm; 3) the basic genetic procedures AGACMRouletteSelection, AGACMXover and AGACMbinaryMutation are implemented and tested in extended experiments; 4) the experimental results show that the ARG algorithm outperforms other common methods: a B-Vector with a score of 3513.6 is obtained after running the ARG algorithm for 50 generations.
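The abstract names three genetic procedures (roulette selection, crossover, binary mutation) and a fitness function built from association rules, but gives no implementation details. The following is a minimal generic sketch of how such a loop might look, not the authors' ARGA. Everything in it is an assumption made for illustration: the `RULES` table, the rule-weight fitness, the 6-bit vectors, and all function names and parameter values are hypothetical.

```python
import random

# Hypothetical rules: (bit positions that must all be set, weight).
# These are illustrative only and are not taken from the paper.
RULES = [({0, 1}, 3.0), ({2}, 1.0), ({3, 4, 5}, 5.0)]
N_BITS = 6


def fitness(vec):
    """Sum the weights of all rules whose positions are all set in vec."""
    return sum(w for idx, w in RULES if all(vec[i] for i in idx))


def roulette_select(pop, scores, rng):
    """Pick one individual with probability proportional to its score."""
    total = sum(scores)
    if total == 0:
        return rng.choice(pop)  # no signal yet: fall back to uniform choice
    pick = rng.random() * total
    acc = 0.0
    for ind, s in zip(pop, scores):
        acc += s
        if pick < acc:
            return ind
    return pop[-1]


def crossover(a, b, rng):
    """One-point crossover: swap the tails of two parents."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(vec, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in vec]


def evolve(generations=50, pop_size=20, rate=0.05, seed=0):
    """Run the loop and return the best vector seen and its score."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_BITS)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scores = [fitness(v) for v in pop]
        nxt = []
        while len(nxt) < pop_size:
            p1 = roulette_select(pop, scores, rng)
            p2 = roulette_select(pop, scores, rng)
            c1, c2 = crossover(p1, p2, rng)
            nxt += [mutate(c1, rate, rng), mutate(c2, rate, rng)]
        pop = nxt[:pop_size]
        best = max(pop + [best], key=fitness)
    return best, fitness(best)
```

Calling `evolve()` returns the best binary vector seen and its score under the toy fitness. In a real text classifier the rules and their weights would presumably come from an association-rule miner run over the training corpus rather than from a hand-written table.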
Source: Journal of Sichuan University (Engineering Science Edition), 2004, No. 3, pp. 1-8 (8 pages). Indexed in EI, CAS, CSCD.
Funding: National Natural Science Foundation of China (60073046); 973 Program (2002CB111504); Doctoral Program Fund (20020610007).
Keywords: association rules; Chinese text classification; genetic algorithm; natural language processing; ARGCM (Association Rules Aided Computing Method)
