SVM-based biomedical named entity recognition (cited by: 18)
Abstract: Named entity recognition is a fundamental task in biomedical data mining. This paper uses a support vector machine (SVM)-based method to recognize named entities in biomedical text. The system combines a rich feature set, including local features, whole-text features, and external-resource features, and experiments evaluate how individual features and feature combinations contribute to system performance. To further improve performance, an abbreviation recognition module and a filter module are also introduced. The experimental results show that the method achieves good results on named entity recognition in biomedical text.
Source: Journal of Harbin Engineering University (indexed in EI, CAS, CSCD, PKU Core), 2006, Issue B07, pp. 570-574 (5 pages)
Funding: Supported by the National 863 Program of China (2004AA117010-08).
Keywords: named entity recognition; SVM; feature selection; abbreviation
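
The abstract mentions "local features" without listing them. As an illustration only, the sketch below shows what token-level local features for biomedical text might look like: the word itself, a coarse word shape, short prefixes and suffixes, and the neighbouring words. The function names (`word_shape`, `local_features`), the feature names, and the window size are all assumptions made for this example; they are not taken from the paper.

```python
import re


def word_shape(token):
    """Collapse a token into a coarse shape, e.g. 'IL-2' becomes 'A-0'.

    Runs of uppercase letters map to 'A', lowercase runs to 'a',
    digit runs to '0'; other characters are kept as-is.
    """
    shape = re.sub(r"[A-Z]+", "A", token)
    shape = re.sub(r"[a-z]+", "a", shape)
    shape = re.sub(r"[0-9]+", "0", shape)
    return shape


def local_features(tokens, i, window=1):
    """Return a dict of hypothetical local features for tokens[i]."""
    tok = tokens[i]
    feats = {
        "word": tok.lower(),
        "shape": word_shape(tok),
        "prefix3": tok[:3].lower(),
        "suffix3": tok[-3:].lower(),
        "has_digit": any(c.isdigit() for c in tok),
        "has_hyphen": "-" in tok,
        "init_cap": tok[:1].isupper(),
    }
    # Context words inside the window; positions outside the sentence are padded.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        inside = 0 <= j < len(tokens)
        feats[f"word[{off:+d}]"] = tokens[j].lower() if inside else "<PAD>"
    return feats


if __name__ == "__main__":
    sentence = ["The", "IL-2", "gene", "is", "expressed"]
    print(local_features(sentence, 1))
```

Feature dicts of this kind are commonly turned into sparse vectors (for example with scikit-learn's `DictVectorizer`) before training an SVM classifier over per-token labels; whether the paper's system follows that route is not stated in the abstract.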