Automatic partitioning of Chinese sentence groups is very important for discourse-based statistical machine translation systems. This paper presents an approach to this problem: first, each sentence in a discourse is expressed as a feature vector; second, a special hierarchical clustering algorithm is applied to represent the discourse as a sentence group tree. A local reoccurrence measure is proposed for selecting key phrases and evaluating their weights. Experimental results show that the approach is promising.
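The two steps in the abstract above (sentences as feature vectors, then hierarchical clustering into a sentence group tree) can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the "special" clustering or the local reoccurrence measure, so this sketch assumes plain term counts as feature weights, cosine similarity, and greedy merging of adjacent groups only. The names `cosine` and `sentence_group_tree` are hypothetical, and whitespace tokenization stands in for Chinese word segmentation.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sentence_group_tree(sentences):
    """Merge the most similar pair of adjacent groups until one tree remains.

    Assumption: raw term counts replace the paper's key-phrase weights.
    Leaves of the returned nested tuple are sentence indices.
    """
    if not sentences:
        raise ValueError("need at least one sentence")
    # Each group is (tree, vector); a leaf tree is the sentence index.
    groups = [(i, Counter(s.split())) for i, s in enumerate(sentences)]
    while len(groups) > 1:
        # Only neighbours may merge, so every group stays a contiguous span.
        best = max(
            range(len(groups) - 1),
            key=lambda i: cosine(groups[i][1], groups[i + 1][1]),
        )
        (left_tree, left_vec), (right_tree, right_vec) = groups[best], groups[best + 1]
        groups[best:best + 2] = [((left_tree, right_tree), left_vec + right_vec)]
    return groups[0][0]


if __name__ == "__main__":
    demo = ["a b c", "a b d", "x y z", "x y w"]
    print(sentence_group_tree(demo))
```

Restricting merges to neighbouring groups is a design choice made here so that every subtree covers a contiguous run of sentences, which is what a sentence group within a discourse would require.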
The N+N nominal sentence is an important structural type of nominal sentence in Mandarin Chinese. Its main structure types are attributive-center, combination, apposition and subject-predicate. Across the main literary genres, the distribution of N+N nominal sentences follows a dominance hierarchy: poem > drama > novel > prose. Regardless of genre, the attributive-center structure is the most frequent type and the appositive structure is the least frequent. Statistical results indicate that most N+N nominal sentences are nominal and that their use is limited by genre. The function of the N+N nominal sentence is textual. At the discourse level, it can serve as theme, as rheme, or as both theme and rheme, according to the theory of Theme-Rheme (T-R) structure. It not only constructs the information structure that delivers textual information, but is also a vital means of discourse cohesion and coherence.
Funding: National High Technology Research and Development Program of China (No. 2006AA01Z139); Young Natural Science Foundation of Fujian Province of China (No. 2008F3105); Natural Science Foundation of Fujian Province of China (No. 2006J0043); Fund of Key Research Project of Fujian Province of China (No. 2006H0038)