Abstract: Translation lexicons are fundamental to natural language processing tasks such as machine translation and cross-language information retrieval. This paper presents a lexicon builder that can automatically extract word translations from a Chinese-English parallel corpus, or assist a lexicographer in compiling them. The key mechanisms of the builder are described, including the co-occurrence measure, indirect association resolution, and multi-word unit translation. Experimental results indicate the effectiveness of the authors' method and the potential of the lexicon builder system.
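The abstract does not say which co-occurrence measure the builder uses. As a minimal sketch, assuming the Dice coefficient as a stand-in and a tiny made-up, pre-tokenized parallel corpus, sentence-level co-occurrence scoring might look like the following. The function names `dice_scores` and `best_translation` and the sample sentences are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import product


def dice_scores(pairs):
    """Score every (source, target) word pair by the Dice coefficient.

    `pairs` is a list of (source_tokens, target_tokens) aligned sentences.
    Counts are per sentence pair, so a word repeated within one sentence
    is counted once.
    """
    src_count = Counter()
    tgt_count = Counter()
    joint = Counter()
    for src, tgt in pairs:
        src_set, tgt_set = set(src), set(tgt)
        src_count.update(src_set)
        tgt_count.update(tgt_set)
        joint.update(product(src_set, tgt_set))
    # Dice(s, t) = 2 * co-occurrences / (occurrences of s + occurrences of t)
    return {
        (s, t): 2 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in joint.items()
    }


def best_translation(scores, source_word):
    """Return the highest-scoring target word for `source_word`, or None."""
    candidates = {t: v for (s, t), v in scores.items() if s == source_word}
    return max(candidates, key=candidates.get) if candidates else None


# Toy corpus, invented for illustration only.
corpus = [
    (["我", "喜欢", "猫"], ["i", "like", "cats"]),
    (["我", "喜欢", "狗"], ["i", "like", "dogs"]),
    (["他", "喜欢", "猫"], ["he", "likes", "cats"]),
]
scores = dice_scores(corpus)
print(best_translation(scores, "猫"))
```

In this toy corpus, 猫 pairs most strongly with "cats", while 喜欢 scores equally against "i", "like", and "cats". That tie suggests why a raw co-occurrence score alone may not be enough and why further disambiguation steps are needed.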
Funding: This research was supported by the National Natural Science Foundation of China (Grant Nos. 61373097, 61272259 and 61272260). Special thanks to Zhancheng Chen, Zhong Qian, and the anonymous reviewers for insightful comments and suggestions.
Abstract: Distinguishing negative or speculative narrative fragments from facts is crucial for deep understanding in natural language processing (NLP). In this paper, we first construct a Chinese corpus consisting of three sub-corpora from different resources. We also present a general framework for Chinese negation and speculation identification. In our method, we first propose a feature-based sequence labeling model to detect negative or speculative cues. In addition, a cross-lingual cue expansion strategy is proposed to increase coverage in cue detection. On this basis, we present a new syntactic structure-based framework to identify the linguistic scope of a negative or speculative cue, in place of the traditional chunking-based framework. Experimental results justify the usefulness of our Chinese corpus and the appropriateness of our syntactic structure-based framework, which shows significant improvement over the state of the art on Chinese negation and speculation identification.