Gene name normalization based on extended semantic similarity (cited by: 1)


Abstract: Gene identifiers in biomedical databases often carry descriptions that are too sparse and incomplete to distinguish the different senses of an ambiguous term. This paper presents a gene name normalization method based on extended semantic similarity to address that problem. The method uses MEDLINE abstracts and Gene Ontology descriptions to generate extended semantic information for each gene identifier in the database. It then compares the context of an ambiguous gene name with the semantic descriptions of its candidate senses and assigns the single identifier that expresses the name's actual meaning. On the BioCreative II gene normalization task corpus, the method achieves a precision of 80%, a recall of 82.4%, and an F-measure of 81.2%. These results indicate that the extended semantic similarity approach is suitable for named entity normalization in the biomedical domain.
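The disambiguation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or text representation the authors use, so the bag-of-words vectors and cosine similarity below are assumptions, and the identifiers `GENE_A` and `GENE_B`, their profile texts, and the function names are hypothetical. The last function only recomputes the F-measure from the reported precision and recall.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context, profiles):
    """Return the identifier whose profile text is most similar to the context."""
    context_vec = bag_of_words(context)
    return max(
        profiles,
        key=lambda gene_id: cosine(context_vec, bag_of_words(profiles[gene_id])),
    )


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical extended profiles, invented for illustration only.
# In the paper they would be built from MEDLINE abstracts and Gene Ontology text.
profiles = {
    "GENE_A": "tumor suppressor cell cycle arrest apoptosis dna damage response",
    "GENE_B": "ion channel membrane transport potassium neuron signalling",
}
context = "The protein induces apoptosis after DNA damage and halts the cell cycle."

best = disambiguate(context, profiles)
print(best)
print(round(f_measure(0.80, 0.824) * 100, 1))
```

Here the context shares five tokens with the first profile and none with the second, so the first identifier is selected. The F-measure check confirms that the reported 81.2% is consistent with a precision of 80% and a recall of 82.4%.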

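Authors: Hu Yuncui, Lin Hongfei, Yang Zhihao

Affiliation: Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology

Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2011, Issue 35, pp. 128-131 (4 pages)

Funding: National Natural Science Foundation of China (No.60673039, No.60973068); National High Technology Research and Development Program of China (863 Program) (No.2006AA01Z151); National Social Science Foundation of China (No.08BTQ025); Ministry of Education Scientific Research Foundation for Returned Overseas Scholars; Specialized Research Fund for the Doctoral Program of Higher Education (No.20090041110002)

Keywords: gene normalization; extended semantic similarity; disambiguation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]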

Citation network

References (14)

1. Morgan A A, Lu Zhiyong, Wang Xinglong, et al. Overview of BioCreative II gene normalization[J]. Genome Biology, 2008, 9(2): 3.
2. Chen L, Liu H, Friedman C. Gene name ambiguity of eukaryotic nomenclatures[J]. Bioinformatics, 2005, 21(2): 248-256.
3. Carpenter R. Phrasal queries with LingPipe and Lucene: ad hoc genomics text retrieval[C]//Proceedings of the 13th Annual Text Retrieval Conference, Gaithersburg, 2004.
4. Settles B. ABNER: an open source tool for automatically tagging genes, proteins and other entity names in text[J]. Bioinformatics, 2005, 21(14): 3191-3192.
5. Xu H, Fan J W, Hripcsak G, et al. Gene symbol disambiguation using knowledge-based profiles[J]. Bioinformatics, 2007, 23(8): 1015-1022.
6. Mariana L, Jose-Maria C, Alberto E. Moara: a Java library for extracting and normalizing gene and protein mentions[J]. BMC Bioinformatics, 2010, 11(1): 157.
7. Joachim W, Katrin T, Udo H. High-performance gene name normalization with GeNo[J]. Bioinformatics, 2009, 25(6): 815-821.
8. Fang H R, Murphy K, Jin Y, et al. Human gene name normalization using text matching with automatically extracted synonym dictionaries[C]//Proceedings of the BioNLP Workshop on Linking Natural Language Processing and Biology. Association for Computational Linguistics, New York, USA, 2006: 41-48.
9. Schuemie M J, Jelier R, Kors J A. Peregrine: lightweight gene name normalization by dictionary lookup[C]//Proc of the Second BioCreative Challenge Evaluation Workshop, Madrid, Spain, 2007: 131-133.
10. Hakenberg J, Plake C, Leaman R, et al. Inter-species normalization of gene mentions with GNAT[J]. Bioinformatics, 2008, 24(16): 126-132.

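Co-cited literature (2)

1. Qin Ying, Zeng Yingfei. Research of Clinical Named Entity Recognition Based on Bi-LSTM-CRF[J]. Journal of Shanghai Jiaotong University (Science), 2018, 23(3): 392-397. Cited by: 15
2. Yang Pei, Yang Zhihao, Luo Ling, Lin Hongfei, Wang Jian. Chemical drug named entity recognition based on an attention mechanism[J]. Journal of Computer Research and Development, 2018, 55(7): 1548-1556. Cited by: 41

Citing literature (1)

1. Kuang Zemin, Li Jianquan, Deng Nan. Application and progress of information extraction in constructing medical knowledge graphs[J]. Journal of Medical Informatics, 2021, 42(1): 29-35. Cited by: 3

Second-level citing literature (3)

1. Zhao Wanting, Huang Haochen, Qin Xinzuo, Guo Xu, Song Juexian. Research progress on the construction and application of stroke knowledge graphs in integrated Chinese and Western medicine[J]. Beijing Medical Journal, 2023, 45(2): 143-146. Cited by: 3
2. Jin Shuyan, Wang Shuang, Huang Qiong, Qiu Wuqi, Lin Yihao. Research on knowledge graph construction based on a breast cancer disease-specific database[J]. Journal of Medical Informatics, 2023, 44(12): 65-70. Cited by: 2
3. Wei Peng, Zhou Bingyuan, Wu Anqi, Qian Gang, Rong Haiqin. Construction approach and application analysis of a Neo4j-based stroke knowledge graph[J]. Clinical Journal of Chinese Medicine, 2024, 16(22): 1-8.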

