The complex sentence structure of English is a bottleneck for our practical machine translation (MT) system. Simplifying English subordinate clauses would greatly relieve the burden of parsing and of other grammatical or semantic analysis of a complex sentence, and thus improve the output quality of the MT system. To our knowledge, however, no satisfactory results have been reported in this field so far. This paper reports the authors' work on a corpus-based approach to English subordinate clause identification. The approach integrates rule-based and statistical methods to find the left and right boundaries of subordinate clauses. The Penn Treebank corpus is used as the training standard. Precision and recall of subordinate clause identification are tested on both closed and open corpora. The closed test yields 92.9% precision and 91.26% recall; the open test yields 80.34% precision and 83.93% recall. The algorithm has been integrated into our machine translation system. The method can also be applied to the processing of any other language.
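The precision and recall figures above can be illustrated with a minimal sketch. The abstract does not state the matching criterion, so this example assumes a predicted clause counts as correct only when both its left and right boundaries match a gold-standard clause exactly. The `Span` type, the `precision_recall` function, and the toy spans are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation of one subordinate clause:
# (sentence_id, left_boundary_token_index, right_boundary_token_index)
Span = Tuple[int, int, int]


def precision_recall(predicted: Iterable[Span], gold: Iterable[Span]) -> Tuple[float, float]:
    """Span-level precision and recall under exact boundary matching.

    Assumption: a predicted clause is correct only if its sentence id,
    left boundary, and right boundary all equal those of a gold clause.
    """
    pred_set: Set[Span] = set(predicted)
    gold_set: Set[Span] = set(gold)
    correct = len(pred_set & gold_set)
    # Guard against empty inputs to avoid division by zero.
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    return precision, recall


# Toy data for illustration only (not from the paper).
gold = [(0, 2, 6), (0, 9, 14), (1, 4, 8), (2, 1, 5)]
predicted = [(0, 2, 6), (0, 9, 13), (1, 4, 8)]  # second span misses the right boundary

p, r = precision_recall(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f}")
```

In this toy case two of the three predicted clauses match exactly, so precision is 2/3, and two of the four gold clauses are found, so recall is 1/2. A looser criterion, such as scoring left and right boundaries separately, would give different numbers.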
This research investigates four categories of high-frequency grammatical errors in the academic writing of 45 students via script analysis. A group of 36 students responded to a questionnaire focusing on their beliefs regarding these grammatical constructions. The participants are enrolled in an international school in Suzhou City. A total of 646 sentences were analyzed, of which 295 (45.7%) are inaccurate. Of the 364 errors found, most (64.0%) fall into the category of subject-verb agreement, followed by subordinate clauses (28.6%); errors in the English existential construction and the passive voice are few. In addition, the usage rate of subordinate clauses is 42.7%, and a large proportion of those uses are incorrect.