Gene name normalization based on extended semantic similarity

Cited by: 1
Abstract: Descriptions of gene identifiers in biomedical databases are often neither rich nor complete enough to distinguish the different senses of an ambiguous term. To address this, a gene name normalization method based on extended semantic similarity is presented. The method uses MEDLINE abstracts and Gene Ontology descriptions to generate extended semantic information for each gene identifier in the database. It then compares the context of an ambiguous gene name with the semantic description of each candidate and assigns the unique gene identifier that expresses the name's actual meaning. On the BioCreative II gene normalization task corpus, the method achieves a precision of 80%, a recall of 82.4%, and an F-measure of 81.2%. The results indicate that the extended semantic similarity approach is suitable for named entity normalization in the biomedical domain.
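Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2011, No. 35, pp. 128-131 (4 pages)
Funding: National Natural Science Foundation of China (No. 60673039, No. 60973068); National High Technology Research and Development Program of China (863 Program) (No. 2006AA01Z151); National Social Science Foundation of China (No. 08BTQ025); Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education; Specialized Research Fund for the Doctoral Program of Higher Education (No. 20090041110002)
Keywords: gene; normalization; extended semantic similarity; disambiguation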
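
The disambiguation step described in the abstract (comparing the context of an ambiguous gene name with the extended description of each candidate identifier) can be illustrated with a minimal sketch. The abstract does not state which similarity measure or term weighting the authors use, so cosine similarity over raw term counts is an assumption here. The identifiers `GENE_A` and `GENE_B`, their descriptions, and the function names are hypothetical placeholders, not data from the paper. The `f_measure` helper only checks that the reported F-measure is consistent with the reported precision and recall.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context, profiles):
    """Return the identifier whose description is most similar to the context.

    `profiles` maps each candidate identifier to its (extended) description.
    Assumption: similarity is plain cosine over term counts.
    """
    ctx = Counter(tokenize(context))
    scores = {
        gene_id: cosine(ctx, Counter(tokenize(description)))
        for gene_id, description in profiles.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy profiles; in the paper's setting these would be built
# from MEDLINE abstracts and Gene Ontology descriptions.
PROFILES = {
    "GENE_A": "tumor suppressor protein cell cycle arrest apoptosis dna damage",
    "GENE_B": "membrane transport ion channel potassium neuron",
}

CONTEXT = "The protein induces apoptosis after DNA damage in tumor cells."

if __name__ == "__main__":
    best, scores = disambiguate(CONTEXT, PROFILES)
    print(best, scores)
    print(round(f_measure(0.80, 0.824), 3))
```

With the toy profiles above, the context shares terms only with the `GENE_A` description, so that identifier is selected. The F-measure check gives 2 × 0.80 × 0.824 / (0.80 + 0.824) ≈ 0.812, which matches the reported 81.2%.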