
Approach of Internet New Word Identification Based on Immune Genetic Algorithm (cited by: 11)
Abstract: With the growth of the Internet, new words emerge continuously, yet current word segmentation methods struggle to identify them promptly and accurately. This paper proposes an Internet new word identification method based on an immune genetic algorithm. Building on an analysis of the characteristics of Internet new words, the method uses the Chinese word-group phenomenon and the concept of word position to extract exemplar antibodies, which are then injected in a targeted manner while the genetic algorithm runs. Experiments show that the method achieves a very high recognition rate for new words in segmentation fragments that conform to the word-group phenomenon, and a largely satisfactory recognition rate for general Internet new words.
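Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2011, No. 1, pp. 240-245 (6 pages)
Funding: Supported by the National High Technology Research and Development Program of China (863 Program, 2006AA12A106) and the National Natural Science Foundation of China (60879015, 60572167)
Keywords: immune genetic algorithm; Chinese word group; word position; antibody; Internet new word identification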
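
The abstract describes injecting exemplar antibodies into a running genetic algorithm. The sketch below illustrates that general idea only; it is not the authors' implementation. It assumes a common immune genetic algorithm formulation (vaccination followed by immune selection), encodes a segmentation as boundary bits between characters, and uses a hypothetical suffix set, toy lexicon, and toy fitness function. All names and scores here are illustrative assumptions, and the input is assumed to be at least three characters long.

```python
import random

# Hypothetical exemplar antibodies: word-group suffixes that tend to sit in
# the word-final position (e.g. "月光族", "上班族"). Illustrative only.
SUFFIX_ANTIBODIES = {"族", "控", "门"}

# Hypothetical toy lexicon, used only to score candidate segmentations.
LEXICON = {"月光", "上班", "我", "是"}


def decode(text, genes):
    """Split text into words; genes[i] == 1 marks a boundary after text[i]."""
    words, start = [], 0
    for i, bit in enumerate(genes):
        if bit:
            words.append(text[start:i + 1])
            start = i + 1
    words.append(text[start:])
    return words


def fitness(text, genes):
    """Toy fitness (an assumption): reward known words and suffix-shaped candidates."""
    score = 0.0
    for w in decode(text, genes):
        if w in LEXICON:
            score += 2.0
        elif len(w) == 3 and w[-1] in SUFFIX_ANTIBODIES:
            score += 3.0
        elif len(w) == 1:
            score += 0.5
        else:
            score -= 1.0
    return score


def vaccinate(text, genes):
    """Inject an antibody: force a 3-character word ending in a known suffix."""
    genes = list(genes)
    for j, ch in enumerate(text):
        if ch in SUFFIX_ANTIBODIES and j >= 2:
            if j - 3 >= 0:
                genes[j - 3] = 1   # boundary before the candidate word
            genes[j - 2] = 0       # no boundary inside the candidate word
            genes[j - 1] = 0
            if j < len(text) - 1:
                genes[j] = 1       # boundary after the suffix
    return genes


def evolve(text, pop_size=20, generations=30, p_mut=0.1, p_vac=0.5, seed=0):
    """Run a small GA with elitism, vaccination, and immune selection."""
    rng = random.Random(seed)
    n = len(text) - 1
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the best individual over unchanged.
        children = [max(pop, key=lambda g: fitness(text, g))]
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            a = max(rng.sample(pop, 2), key=lambda g: fitness(text, g))
            b = max(rng.sample(pop, 2), key=lambda g: fitness(text, g))
            # Single-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n - 1)
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            if rng.random() < p_vac:
                vaccinated = vaccinate(text, child)
                # Immune selection: keep the vaccinated copy only if not worse.
                if fitness(text, vaccinated) >= fitness(text, child):
                    child = vaccinated
            children.append(child)
        pop = children
    return max(pop, key=lambda g: fitness(text, g))


if __name__ == "__main__":
    sample = "我是月光族"
    print(decode(sample, evolve(sample)))
```

On the toy input, the intended segmentation is 我 / 是 / 月光族, where 月光族 is the candidate new word. A realistic system would replace the toy fitness with corpus statistics and derive the antibodies from observed word groups rather than a hard-coded set.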
