
Chinese Word Segmentation via BiLSTM+Semi-CRF with Relay Node (cited by: 2)

Abstract: Semi-Markov conditional random fields (Semi-CRFs) have been successfully utilized in many segmentation problems, including Chinese word segmentation (CWS). The advantage of Semi-CRF lies in its inherent ability to exploit properties of segments instead of individual elements of sequences. Despite this theoretical advantage, Semi-CRF is still not the best choice for CWS because its computational complexity is quadratic in the sentence length. In this paper, we propose a simple yet effective framework to help Semi-CRF achieve performance comparable to CRF-based models under similar computational complexity. Specifically, we first adopt a bi-directional long short-term memory (BiLSTM) network at the character level to model the context information, and then use a simple but effective fusion layer to represent the segment information. In addition, to model arbitrarily long segments within linear time complexity, we propose a new model named Semi-CRF-Relay. The direct modeling of segments makes it easy to combine the model with word features, and CWS performance can be enhanced merely by adding publicly available pre-trained word embeddings. Experiments on four popular CWS datasets show the effectiveness of our proposed methods. The source code and pre-trained embeddings of this paper are available at https://github.com/fastnlp/fastNLP/.
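The quadratic cost mentioned in the abstract can be illustrated with a generic semi-Markov forward recursion. The sketch below is a textbook-style dynamic program, not the paper's implementation and not Semi-CRF-Relay, whose details the abstract does not give. The function name `semi_markov_log_partition`, the `seg_score` callback, and the `max_len` cap are all hypothetical names introduced for illustration only.

```python
import math

def semi_markov_log_partition(seg_score, n, max_len=None):
    """Log-partition over all segmentations of a length-n sequence.

    seg_score(i, j) is a hypothetical scorer for the segment covering
    positions i..j-1. With max_len=None every segment length is allowed,
    so the loops make O(n^2) scorer calls; with a cap they make
    O(n * max_len) calls, at the price of forbidding longer segments.
    """
    alpha = [-math.inf] * (n + 1)
    alpha[0] = 0.0  # the empty prefix has exactly one (empty) segmentation
    for j in range(1, n + 1):
        start = 0 if max_len is None else max(0, j - max_len)
        # Sum over every possible start i of the last segment [i, j).
        terms = [alpha[i] + seg_score(i, j) for i in range(start, j)]
        m = max(terms)
        alpha[j] = m + math.log(sum(math.exp(t - m) for t in terms))
    return alpha[n]
```

With a constant score of zero, the result is simply the log of the number of admissible segmentations, which makes the recursion easy to check by hand on short inputs.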
Source: Journal of Computer Science & Technology (计算机科学技术学报(英文版)), indexed in SCIE, EI, CSCD. 2020, Issue 5, pp. 1115-1126 (12 pages).
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61751201 and 61672162, and the Shanghai Municipal Science and Technology Major Project under Grant No. 2018SHZDZX01 and ZJLab.