
Research of Text Categorization Based on Immune Algorithm (cited by: 6)
Abstract: Drawing on the biological mechanisms of the immune system, this paper proposes a clonal selection algorithm based on antibody density, called the Clonal Selection Algorithm Based on Antibody Density (CSABAD). In this algorithm, an antibody's selection probability is determined jointly by its affinity and its density, so that only antibodies with high affinity and low density are promoted. The algorithm is applied to text categorization, where the antigen, the B cell, and the antibody correspond respectively to a training text, one candidate solution of the classifier, and the affinity between that solution and the training texts. The trained classifier contains multiple memory cells, which preserves the diversity of solutions. In experiments on the 20_newsgroups dataset, the method reaches an F1 score of up to 80.90%, outperforming the Rocchio and Naive Bayes methods.
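The abstract states that selection probability depends on both affinity and density but does not give the formula. The sketch below is one plausible way to realize that idea, not the paper's actual method: density is assumed to be the fraction of the population that lies within a cosine-similarity threshold of an antibody, and the weight is assumed to be affinity multiplied by an exponential density penalty. The function names, `threshold`, and `beta` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def densities(population, threshold=0.9):
    """Assumed density: share of the population similar to each antibody.

    An antibody counts itself, so every density is at least 1/len(population).
    """
    n = len(population)
    return [
        sum(1 for other in population if cosine(ab, other) >= threshold) / n
        for ab in population
    ]


def selection_probabilities(affinities, dens, beta=1.0):
    """Assumed form: weight = affinity * exp(-density / beta), then normalise.

    High affinity raises the weight; high density lowers it.
    """
    weights = [a * math.exp(-d / beta) for a, d in zip(affinities, dens)]
    total = sum(weights)
    if total == 0:
        # Degenerate case: fall back to a uniform distribution.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


# Three candidate solutions with equal affinity; the first two are duplicates.
population = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
affinities = [0.8, 0.8, 0.8]
probs = selection_probabilities(affinities, densities(population))
```

With equal affinities, the two duplicate antibodies share a higher density and each receives a lower selection probability than the unique third antibody, which is the diversity-preserving behavior the abstract describes.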
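Source: 《微计算机信息》 (Control & Automation), PKU Core journal, 2007, No. 24, pp. 210-212 (3 pages)
Funding: National Natural Science Foundation of China (90412015); National Development and Reform Commission project (CNGI-04-12-2A)
Keywords: text categorization; immune; clonal selection; antibody density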

References (5)

  • 1. The 20_newsgroups Dataset [DB/OL]. http://www.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes/20_newsgroups.tar.gz
  • 2. Fabrizio Sebastiani. Machine learning in automated text categorization [J]. ACM Computing Surveys, 2002, 34(1): 1-47.
  • 3. J. E. Hunt, D. E. Cooke. Learning using an artificial immune system [J]. Journal of Network and Computer Applications, 1996, 19: 189-212.
  • 4. Leandro Nunes de Castro, Fernando J. Von Zuben. The Clonal Selection Algorithm with Engineering Applications [C]. In: Proceedings of GECCO'00, Las Vegas, USA, 2000, 7: 36-37.
  • 5. Yang Lihua, Dai Qi, Yang Zhanhua. Research on text categorization techniques [J]. 《微计算机信息》, 2006(05X): 209-211. (Cited by: 13)
