Research on protein-protein interaction extraction based on semi-supervised learning (cited by: 12)
Abstract: Two semi-supervised learning methods, self-training and co-training, are applied to protein-protein interaction (PPI) extraction, using a small number of labeled samples and a large number of unlabeled samples to reduce the labeling burden. First, a support vector machine (SVM) model based on word features is used for self-training. Then a word-feature SVM model and a dependency-tree-feature SVM model are used together for co-training. Experiments on four corpora show that both methods are effective for PPI extraction and reduce the amount of labeled data required. Compared with self-training, co-training lets two relatively independent views complement and learn from each other, and so makes more effective use of the unlabeled data.
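The abstract gives no implementation details, so the following is only a minimal sketch of the two training loops it describes, not the authors' code. It assumes binary labels in {+1, -1}, a labeled pool shared by all views, and one most-confident positive and one most-confident negative pick per view in each round. The names `train_linear_svm`, `score`, `predict`, and `semi_train` are hypothetical, a small hinge-loss linear SVM stands in for the paper's SVM models, and the toy 2-d vectors stand in for the word features and dependency-tree features. With one view per example the loop is self-training; with two views it is co-training.

```python
def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.1):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss.
    X is a list of feature vectors, y a list of labels in {+1, -1}."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]   # regularization step
            if margin < 1.0:                           # hinge loss is active
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def score(model, x):
    """Decision value w.x + b; its magnitude is used as the confidence."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def predict(model, x):
    return 1 if score(model, x) >= 0 else -1


def semi_train(labeled, labels, unlabeled, rounds=3):
    """Each example is a tuple of per-view feature vectors.
    One view per example gives self-training; two views give co-training."""
    pool = list(zip(labeled, labels))      # labeled pool shared by all views
    rest = list(unlabeled)
    n_views = len(labeled[0])

    def fit():
        ys = [y for _, y in pool]
        return [train_linear_svm([x[v] for x, _ in pool], ys)
                for v in range(n_views)]

    for _ in range(rounds):
        if not rest:
            break
        picked = {}                        # index in rest -> pseudo-label
        for v, model in enumerate(fit()):
            scored = sorted((score(model, x[v]), i) for i, x in enumerate(rest))
            (lo, lo_i), (hi, hi_i) = scored[0], scored[-1]
            if lo < 0:                     # most confident negative in view v
                picked.setdefault(lo_i, -1)
            if hi > 0:                     # most confident positive in view v
                picked.setdefault(hi_i, 1)
        if not picked:
            break
        pool += [(rest[i], lab) for i, lab in picked.items()]
        rest = [x for i, x in enumerate(rest) if i not in picked]
    return fit(), pool


# Toy data (hypothetical): each example is (view A vector, view B vector).
labeled = [([2.0, 1.0], [1.0, 2.0]), ([-2.0, -1.0], [-1.0, -2.0])]
labels = [1, -1]
unlabeled = [
    ([1.5, 0.5], [0.5, 1.5]), ([-1.0, -1.5], [-1.5, -1.0]),
    ([0.5, 1.0], [1.0, 0.4]), ([-0.5, -0.8], [-0.9, -0.3]),
    ([3.0, 2.0], [1.0, 3.0]), ([-2.5, -1.0], [-1.0, -2.5]),
]

# Co-training: both views label data for the shared pool.
co_models, co_pool = semi_train(labeled, labels, unlabeled)

# Self-training: the same loop restricted to view A only.
self_models, self_pool = semi_train(
    [(a,) for a, _ in labeled], labels, [(a,) for a, _ in unlabeled])

print("co-training pool size:", len(co_pool))
print("self-training pool size:", len(self_pool))
```

In this sketch, co-training differs from self-training only in that a second, separately trained view can contribute pseudo-labels the first view was less sure about, which is the complementarity the abstract attributes to two relatively independent views.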
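Source: Journal of Shandong University (Engineering Science), 2009, No. 3, pp. 16-21 (6 pages); indexed in CAS and the Peking University Core Journals list
Funding: National Natural Science Foundation of China (60373095, 60673039); National 863 High-Tech Program of China (2006AA01Z151); Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education [教外司留(2007)1108]
Keywords: interaction extraction; semi-supervised learning; word feature; self-training; co-training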