The paper proposes a unified framework that combines the advantages of the fast one-at-a-time approach and the high-performance all-at-once approach to Chinese Word Segmentation (CWS) and Part-of-Speech (PoS) tagging. In this framework, the input to the PoS tagger is a candidate set of several CWS results provided by the CWS model. The widely used one-at-a-time and all-at-once approaches are the two extreme cases of the proposed candidate-based approach. Experiments on Penn Chinese Treebank 5 and Tsinghua Chinese Treebank show that the generalized candidate-based approach outperforms the one-at-a-time approach and even the all-at-once approach. The candidate-based approach is also faster than the time-consuming all-at-once approach. The authors compare three methods of generating the candidate set, based on sentences, words, and character intervals. The word-based method performs best.
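The candidate-based idea described above can be illustrated with a minimal sketch. The abstract does not give the paper's scoring function or candidate-generation procedure, so everything here is an assumption made for illustration: the function name `candidate_based_tagging`, the toy lexicon and its log-scores, the hand-written candidate list, and the choice to add segmentation and tagging scores. With `k = 1` the sketch behaves like the one-at-a-time pipeline; with `k` covering every candidate it approximates the all-at-once search.

```python
from typing import Callable, List, Tuple

Segmentation = List[str]
Tagged = List[Tuple[str, str]]

# Hypothetical toy lexicon: word -> (PoS tag, log-score). Not from the paper.
TOY_LEXICON = {
    "研究": ("VV", -0.5),
    "生命": ("NN", -0.3),
    "起源": ("NN", -0.4),
    "研究生": ("NN", -0.6),
    "命": ("NN", -2.5),
}


def toy_tagger(words: Segmentation) -> Tuple[Tagged, float]:
    """Tag each word by lexicon lookup; unknown words get tag 'X' and a penalty."""
    tagged, total = [], 0.0
    for word in words:
        tag, score = TOY_LEXICON.get(word, ("X", -5.0))
        tagged.append((word, tag))
        total += score
    return tagged, total


def candidate_based_tagging(
    candidates: List[Tuple[Segmentation, float]],
    tagger: Callable[[Segmentation], Tuple[Tagged, float]],
    k: int,
) -> Tagged:
    """Tag the k best-scoring segmentations and return the best joint result."""
    if k < 1 or not candidates:
        raise ValueError("need at least one candidate and k >= 1")
    # Keep only the k highest-scoring segmentations from the CWS model.
    top_k = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    best_tagged: Tagged = []
    best_score = float("-inf")
    for words, seg_score in top_k:
        tagged, tag_score = tagger(words)
        joint = seg_score + tag_score  # assumption: log-scores simply add
        if joint > best_score:
            best_tagged, best_score = tagged, joint
    return best_tagged


# Hand-written candidate segmentations of "研究生命起源" with assumed CWS log-scores.
CANDIDATES = [
    (["研究生", "命", "起源"], -1.0),
    (["研究", "生命", "起源"], -1.2),
]

one_at_a_time = candidate_based_tagging(CANDIDATES, toy_tagger, k=1)
with_candidates = candidate_based_tagging(CANDIDATES, toy_tagger, k=2)
```

In this toy setup, `k = 1` commits to the segmenter's top choice, while `k = 2` lets the tagger's scores overturn it in favour of the second segmentation. The cost grows with the number of candidates tagged, which is the speed and accuracy trade-off the abstract refers to.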
The acquisition of word knowledge, in terms of both breadth and depth, has a fundamental impact on language learning. A comprehensive framework of word knowledge is therefore needed to help language learners master all aspects of word knowledge. This paper proposes such a framework based on a review of the literature in the field. The pedagogical implications of the framework are then presented and discussed critically. Finally, the paper suggests that syllabus designers consider incorporating the word knowledge framework into the syllabus.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60873174.