
Protein-protein interaction extraction method using shallow parsing (cited by: 2)
Abstract: To address the shortcomings of traditional machine-learning methods for protein-protein interaction (PPI) extraction, this paper proposes an information extraction method that incorporates shallow parsing. The method first applies shallow parsing to each candidate sentence, covering phrase chunking, appositive analysis, coordinate-structure analysis, and sentence splitting. After this step, each sentence is divided into multiple separate grammatical units. A maximum entropy classifier is then applied to each unit to extract protein-protein interactions. On the BC-PPI corpus, the method achieves an F1 score of 62.1%. Comparative experiments show that the method effectively reduces false positives and false negatives and improves extraction performance.
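The two-stage idea in the abstract (split a sentence into units, then classify protein pairs within each unit) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the semicolon-based `split_units` is only a stand-in for the paper's shallow parsing, the "words between the two mentions" features are hypothetical, the toy training sentences are invented, and the binary maximum entropy model is written as plain logistic regression. The abstract does not specify the authors' rules, features, or training setup.

```python
import math
import re
from itertools import combinations

def split_units(sentence):
    # Hypothetical stand-in for shallow parsing: split on semicolons only.
    return [u.strip() for u in sentence.split(";") if u.strip()]

def candidate_pairs(unit, proteins):
    # Pair up known protein names that co-occur inside one unit.
    tokens = re.findall(r"[A-Za-z0-9-]+", unit)
    found = [i for i, t in enumerate(tokens) if t in proteins]
    return [(tokens, i, j) for i, j in combinations(found, 2)]

def features(tokens, i, j):
    # Toy features (assumed): a bias term plus the words between the mentions.
    feats = {"bias": 1.0}
    for t in tokens[i + 1:j]:
        feats["between=" + t.lower()] = 1.0
    return feats

def predict(weights, feats):
    # Probability of "interaction" under a binary maximum entropy model.
    z = sum(weights.get(k, 0.0) * v for k, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, lr=0.5):
    # Stochastic gradient ascent on the log-likelihood.
    weights = {}
    for _ in range(epochs):
        for feats, label in examples:
            err = label - predict(weights, feats)
            for k, v in feats.items():
                weights[k] = weights.get(k, 0.0) + lr * err * v
    return weights

def extract(sentence, proteins, weights, threshold=0.5):
    # Classify candidate pairs unit by unit; pairs spanning units are never formed.
    results = []
    for unit in split_units(sentence):
        for tokens, i, j in candidate_pairs(unit, proteins):
            if predict(weights, features(tokens, i, j)) > threshold:
                results.append((tokens[i], tokens[j]))
    return results

# Invented toy data, for illustration only.
PROTEINS = {"RAD51", "BRCA2", "TP53", "MDM2"}
TRAIN = [("RAD51 binds BRCA2", 1), ("MDM2 interacts with TP53", 1),
         ("RAD51 and TP53 were measured", 0), ("BRCA2 or MDM2 levels", 0)]
examples = [(features(t, i, j), y)
            for s, y in TRAIN for t, i, j in candidate_pairs(s, PROTEINS)]
weights = train(examples)
print(extract("TP53 binds MDM2; RAD51 and BRCA2 were measured", PROTEINS, weights))
```

Because candidate pairs are generated only within a unit, cross-unit pairs such as TP53 and RAD51 are never scored, which is one way such a split could reduce false positives.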
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 3, pp. 972-975 (4 pages)
Funding: National High-Tech Development Program project (2006AA01Z411)
Keywords: protein-protein interaction (PPI); information extraction; shallow parsing; maximum entropy


