Scaling Conditional Random Fields by One-Against-the-Other Decomposition (cited by: 1)

Abstract: As a powerful sequence labeling model, conditional random fields (CRFs) have been applied successfully to many natural language processing (NLP) tasks. However, the high complexity of CRF training only allows a very small tag (or label) set, because training becomes intractable as the tag set grows. This paper proposes an improved decomposed training and joint decoding algorithm for CRF learning. Instead of training a single CRF model for all tags, it trains a binary sub-CRF independently for each tag. An optimal tag sequence is then produced by a joint decoding algorithm based on the probabilistic output of all sub-CRFs involved. To test its effectiveness, we apply this approach to Chinese word segmentation (CWS), treated as a sequence labeling problem. Our evaluation shows that it can reduce the computational cost of this language processing task by 40-50% without any significant performance loss on various large-scale data sets.
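The abstract does not spell out the joint decoding algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that each binary sub-CRF yields, for every position, a probability that its own tag applies, and that the tag set is the common B/M/E/S segmentation scheme (an assumption; the abstract names no tag set). A Viterbi-style search then picks the highest-scoring sequence among those that respect the allowed tag transitions. The names `joint_decode`, `TAGS`, and `ALLOWED` are hypothetical.

```python
import math

# Assumed tag set for word segmentation: Begin, Middle, End, Single.
TAGS = ["B", "M", "E", "S"]

# Assumed legal (previous_tag, tag) pairs for the B/M/E/S scheme.
ALLOWED = {
    ("B", "M"), ("B", "E"),
    ("M", "M"), ("M", "E"),
    ("E", "B"), ("E", "S"),
    ("S", "B"), ("S", "S"),
}


def joint_decode(probs, tags=TAGS, allowed=ALLOWED):
    """Pick the best tag sequence from per-tag binary model outputs.

    probs: one dict per position, mapping tag -> probability reported by
           that tag's binary sub-model (the dicts need not sum to 1).
    Returns the legal sequence with the highest sum of log-probabilities.
    """
    if not probs:
        return []

    def logp(p):
        # Clamp to avoid log(0) when a sub-model outputs exactly zero.
        return math.log(max(p, 1e-12))

    # best[t] = (score, path) of the best legal sequence ending in tag t.
    best = {t: (logp(probs[0][t]), [t]) for t in tags}
    for column in probs[1:]:
        new_best = {}
        for t in tags:
            candidates = [
                (score, path)
                for prev, (score, path) in best.items()
                if (prev, t) in allowed
            ]
            if candidates:
                score, path = max(candidates, key=lambda c: c[0])
                new_best[t] = (score + logp(column[t]), path + [t])
        if not new_best:
            raise ValueError("no legal tag sequence under the given transitions")
        best = new_best
    return max(best.values(), key=lambda c: c[0])[1]


# Taking the most probable tag at each position independently would give
# B, S, S here, which is illegal (B cannot be followed by S).
example = [
    {"B": 0.6, "M": 0.1, "E": 0.1, "S": 0.4},
    {"B": 0.2, "M": 0.1, "E": 0.3, "S": 0.7},
    {"B": 0.1, "M": 0.1, "E": 0.2, "S": 0.8},
]
print(joint_decode(example))
```

The example shows why a joint step is needed at all: independently trained binary models can disagree or produce an illegal sequence, and the search resolves this by scoring whole sequences rather than single positions.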

Authors: Hai Zhao (赵海), Chunyu Kit (揭春雨)

Affiliation: Department of Chinese

Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), 2008, No. 4, pp. 612-619 (8 pages). Indexed in SCIE, EI, CSCD.

Funding: Supported by the Research Grants Council of Hong Kong S.A.R., China, through the CERG under Grant No. 9040861 (CityU 1318/03H), and by City University of Hong Kong through the Strategic Research under Grant No. 7002037.

Keywords: natural language processing, machine learning, conditional random fields, Chinese word segmentation

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]




