Automatic partitioning of Chinese text into sentence groups is very important for discourse-based statistical machine translation systems. This paper presents an approach to the problem: first, each sentence in a discourse is expressed as a feature vector; second, a special hierarchical clustering algorithm is applied to represent the discourse as a sentence group tree. A local reoccurrence measure is proposed for selecting key phrases and evaluating their weights. Experimental results show that the approach is promising.
Funding: National High Technology Research and Development Program of China (No. 2006AA01Z139); Young Natural Science Foundation of Fujian Province of China (No. 2008F3105); Natural Science Foundation of Fujian Province of China (No. 2006J0043); Fund of Key Research Project of Fujian Province of China (No. 2006H0038)
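The two steps named in the abstract (sentences as feature vectors, then hierarchical clustering into a sentence group tree) can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the "special" clustering procedure or the local reoccurrence measure, so this example assumes plain token counts as feature weights, cosine similarity, and a rule that only adjacent segments may merge. The function names `cosine` and `build_sentence_group_tree` are hypothetical, and the input is assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_sentence_group_tree(sentences):
    """Build a binary tree by repeatedly merging the most similar adjacent segments.

    `sentences` is a list of token lists. Leaves of the returned tree are
    sentence indices; internal nodes are (left, right) tuples.
    Assumption: raw token counts stand in for the paper's key-phrase weights.
    """
    if not sentences:
        raise ValueError("at least one sentence is required")
    nodes = list(range(len(sentences)))
    vectors = [Counter(tokens) for tokens in sentences]
    while len(nodes) > 1:
        # Only neighbouring segments are candidates, so groups stay contiguous.
        best = max(
            range(len(nodes) - 1),
            key=lambda i: cosine(vectors[i], vectors[i + 1]),
        )
        nodes[best:best + 2] = [(nodes[best], nodes[best + 1])]
        vectors[best:best + 2] = [vectors[best] + vectors[best + 1]]
    return nodes[0]


if __name__ == "__main__":
    doc = [["a", "b"], ["a", "b"], ["c", "d"], ["c", "d"]]
    print(build_sentence_group_tree(doc))
```

For the four toy sentences in the usage example, the first two and the last two share all their tokens, so the sketch groups them as `((0, 1), (2, 3))`. Restricting merges to neighbours is a design choice made here so that every subtree covers a contiguous run of sentences, which is what a sentence group partition requires.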