Abstract: This paper describes the design of CEMT-Ⅱ, an interactive Chinese-English machine translation system. Building on the CEMT-Ⅰ system, CEMT-Ⅱ is being developed to translate Chinese scientific documents into English. A user-friendly interface has been developed to resolve various complex ambiguities. The Chinese user need not know English well, since all questions and choices are expressed in Chinese.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61173100 and the Fundamental Research Funds for the Central Universities under Grant No. GDUT10RW202.
Abstract: A hybrid approach to English Part-of-Speech (PoS) tagging is presented, with English-Chinese machine translation in the business domain as its target application. It demonstrates how an existing tagger can be adapted to learn from a small amount of data and to handle unknown words for the purpose of machine translation. A small (998 k) annotated English corpus in the business domain is built semi-automatically based on a new tagset; the maximum entropy model is adopted, and a rule-based approach is used in post-processing. The tagger is further applied to Noun Phrase (NP) chunking. Experiments show that the tagger achieves an accuracy of 98.14%. In the application to NP chunking, the tagger yields a 2.21% increase in F-score compared with the results obtained using the Stanford tagger.
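For readers unfamiliar with the NP chunking metric cited above, the following is a minimal sketch of how a chunk-level F-score is commonly computed. The abstract does not describe the authors' evaluation procedure, so everything here is an assumption for illustration: the BIO tag encoding (`B-NP`, `I-NP`, `O`), exact-span matching, and the hypothetical helper names `np_spans` and `chunk_f_score`.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def np_spans(tags: List[str]) -> Set[Span]:
    """Collect NP chunk spans from a BIO tag sequence (assumed encoding)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        if tag == "B-NP":
            # A new chunk begins; close any chunk that is still open.
            if start is not None:
                spans.add((start, i))
            start = i
        elif tag == "I-NP":
            # Tolerate a chunk that opens with I-NP instead of B-NP.
            if start is None:
                start = i
        else:
            # Any other tag ends the current chunk.
            if start is not None:
                spans.add((start, i))
            start = None
    if start is not None:
        spans.add((start, len(tags)))
    return spans


def chunk_f_score(gold: List[str], pred: List[str]) -> float:
    """F-score over exactly matching NP spans (harmonic mean of P and R)."""
    gold_spans = np_spans(gold)
    pred_spans = np_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = ["B-NP", "I-NP", "O", "B-NP", "O"]
    pred = ["B-NP", "I-NP", "O", "O", "O"]
    print(chunk_f_score(gold, pred))
```

Under this scheme, a predicted chunk counts as correct only if both of its boundaries match the gold annotation, which is why PoS tagging errors near phrase boundaries can affect the chunking score.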