
Scaling Conditional Random Fields by One-Against-the-Other Decomposition (cited by 1)
Abstract: As a powerful sequence labeling model, conditional random fields (CRFs) have had successful applications in many natural language processing (NLP) tasks. However, the high complexity of CRF training restricts it to a very small tag (or label) set, because training becomes intractable as the tag set enlarges. This paper proposes an improved decomposed training and joint decoding algorithm for CRF learning. Instead of training a single CRF model for all tags, it trains a binary sub-CRF independently for each tag. An optimal tag sequence is then produced by a joint decoding algorithm based on the probabilistic output of all sub-CRFs involved. To test its effectiveness, we apply this approach to tackling Chinese word segmentation (CWS) as a sequence labeling problem. Our evaluation shows that it can reduce the computational cost of this language processing task by 40-50% without any significant performance loss on various large-scale data sets.
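The decomposition described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the per-position probabilities stand in for the outputs of four hypothetical binary sub-CRFs (one per tag of the common B/M/E/S word-segmentation scheme), and joint decoding is approximated here as a Viterbi search over the combined sub-CRF scores, restricted to tag transitions that are legal for CWS.

```python
import math

# Hypothetical per-position outputs of four binary sub-CRFs, one per CWS tag
# (B = word begin, M = middle, E = end, S = single-character word). Each value
# stands in for P(tag applies at position i) from that tag's independent model.
TAGS = ["B", "M", "E", "S"]
sub_crf_probs = [
    {"B": 0.7, "M": 0.1, "E": 0.1, "S": 0.4},  # position 0
    {"B": 0.2, "M": 0.2, "E": 0.8, "S": 0.1},  # position 1
    {"B": 0.3, "M": 0.1, "E": 0.2, "S": 0.6},  # position 2
]

# Legal tag bigrams for CWS: every word is either B (M)* E or a single S.
LEGAL = {("B", "M"), ("B", "E"), ("M", "M"), ("M", "E"),
         ("E", "B"), ("E", "S"), ("S", "B"), ("S", "S")}

def joint_decode(probs):
    """Viterbi search over combined sub-CRF scores, restricted to legal
    transitions; returns the jointly best tag sequence."""
    n = len(probs)
    # A sequence can only start with B or S.
    score = {t: math.log(probs[0][t] + 1e-12) for t in ("B", "S")}
    back = [{}]  # back[i][t] = best predecessor of tag t at position i
    for i in range(1, n):
        new_score, ptr = {}, {}
        for t in TAGS:
            cands = [(score[p], p) for p in score if (p, t) in LEGAL]
            if not cands:
                continue
            best, prev = max(cands)
            new_score[t] = best + math.log(probs[i][t] + 1e-12)
            ptr[t] = prev
        back.append(ptr)
        score = new_score
    # A sequence can only end with E or S; backtrack from the best of those.
    last = max((t for t in score if t in ("E", "S")), key=lambda t: score[t])
    seq = [last]
    for i in range(n - 1, 0, -1):
        seq.append(back[i][seq[-1]])
    return list(reversed(seq))

print(joint_decode(sub_crf_probs))  # -> ['B', 'E', 'S']
```

For the toy scores above, the transition constraints overrule a naive per-position argmax: position 0 prefers B, which forces position 1 toward M or E, and the decoder settles on the consistent segmentation B-E, S.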
Authors: 赵海 (Hai Zhao), 揭春雨 (Chunyu Kit)
Affiliation: Department of Chinese
Source: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2008, No. 4, pp. 612-619 (8 pages)
Funding: Supported by the Research Grants Council of Hong Kong S.A.R., China, through the CERG under Grant No. 9040861 (CityU 1318/03H), and by City University of Hong Kong through the Strategic Research Grant No. 7002037.
Keywords: natural language processing, machine learning, conditional random fields, Chinese word segmentation

