Grammar learning has been a bottleneck problem for a long time. In this paper, we propose a method of seman- tic separator learning, a special case of grammar learning. The method is based on the hypothesis that some ...Grammar learning has been a bottleneck problem for a long time. In this paper, we propose a method of seman- tic separator learning, a special case of grammar learning. The method is based on the hypothesis that some classes of words, called semantic separators, split a sentence into sev- eral constituents. The semantic separators are represented by words together with their part-of-speech tags and other infor- mation so that rich semantic information can be involved. In the method, we first identify the semantic separators with the help of noun phrase boundaries, called subseparators. Next, the argument classes of the separators are learned from cor- pus by generalizing argument instances in a hypernym space. Finally, in order to evaluate the learned semantic separators, we use them in unsupervised Chinese text parsing. The exper- iments on a manually labeled test set show that the proposed method outperforms previous methods of unsupervised text parsing.展开更多
文摘Grammar learning has been a bottleneck problem for a long time. In this paper, we propose a method of seman- tic separator learning, a special case of grammar learning. The method is based on the hypothesis that some classes of words, called semantic separators, split a sentence into sev- eral constituents. The semantic separators are represented by words together with their part-of-speech tags and other infor- mation so that rich semantic information can be involved. In the method, we first identify the semantic separators with the help of noun phrase boundaries, called subseparators. Next, the argument classes of the separators are learned from cor- pus by generalizing argument instances in a hypernym space. Finally, in order to evaluate the learned semantic separators, we use them in unsupervised Chinese text parsing. The exper- iments on a manually labeled test set show that the proposed method outperforms previous methods of unsupervised text parsing.