As the title indicates, this paper offers a brief analysis of word-meaning ambiguity in translation practice: what lexical ambiguity is, how it arises, and how it affects translation. What effective methods can help us achieve a successful translation? Although a perfect solution to the problems posed by word-meaning ambiguity is difficult to find, we try to contribute what we can toward one and to draw attention to these problems.
To improve Chinese overlapping ambiguity resolution based on a support vector machine, statistical features are studied for representing the feature vectors. First, four statistical parameters (mutual information, accessor variety, two-character word frequency, and single-character word frequency) are each used on their own to describe the feature vectors. Then the remaining parameters are added as complementary features to the parameters that obtain the best results, to further improve classification performance. Experimental results show that features represented by mutual information, single-character word frequency, and accessor variety obtain an optimum result of 94.39%. Compared with a commonly used word probability model, accuracy improves by 6.62%. These comparative results confirm that classification performance can be improved by feature selection and representation.
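Two of the statistical parameters named above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes pointwise mutual information over adjacent characters and a simplified accessor variety (the smaller of the distinct left-neighbour and right-neighbour counts, with each sentence boundary collapsed into a single marker). The function names `make_pmi` and `accessor_variety` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def make_pmi(corpus):
    """Build a pointwise mutual information function from a list of sentences."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in corpus:
        unigrams.update(sentence)
        bigrams.update(sentence[i:i + 2] for i in range(len(sentence) - 1))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    def pmi(x, y):
        # log2( P(xy) / (P(x) * P(y)) ); -inf when the pair never occurs.
        joint = bigrams[x + y]
        if joint == 0:
            return float("-inf")
        p_xy = joint / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        return math.log2(p_xy / (p_x * p_y))

    return pmi


def accessor_variety(corpus, s):
    """Simplified accessor variety: min(distinct left, distinct right neighbours).

    Assumption: sentence start and end each count as one marker.
    """
    left, right = set(), set()
    for sentence in corpus:
        start = sentence.find(s)
        while start != -1:
            end = start + len(s)
            left.add(sentence[start - 1] if start > 0 else "<s>")
            right.add(sentence[end] if end < len(sentence) else "</s>")
            start = sentence.find(s, start + 1)
    return min(len(left), len(right))


# Toy corpus (hypothetical); Latin letters stand in for Chinese characters.
corpus = ["abab", "abc"]
pmi = make_pmi(corpus)
features = [pmi("a", "b"), pmi("b", "c"), accessor_variety(corpus, "ab")]
```

For an overlapping string ABC in which both AB and BC are words, values like these could form the feature vector passed to a classifier that chooses between AB/C and A/BC.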
This paper compares several approaches to machine translation (MT) design and draws on ideas from phrase structure grammar, GPSG, HPSG, and corpus-based methods. Taking words as the core, it builds a set of word rules and develops an English-Chinese machine translation system based on them. The paper also discusses some technical problems in building an MT system and provides an estimation principle for applying rules. With this principle, syntactic ambiguities in the MT system are resolved more effectively.
Automatic word segmentation is widely used for disambiguation when processing large-scale real text. However, during unknown word detection in Chinese word segmentation, many detected word candidates are invalid. These false unknown word candidates degrade overall segmentation accuracy because they also affect the segmentation of known words. In this paper, we propose several methods for reducing the difficulty and improving the accuracy of word segmentation of written Chinese, such as full segmentation of a sentence, processing of reduplicated words and idioms, and statistical identification of unknown words. A simulation shows the feasibility of the proposed methods in improving the accuracy of Chinese word segmentation.
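Full segmentation of a sentence can be read as enumerating every split that a lexicon permits. The abstract does not specify the algorithm, so the following is only a sketch under that reading; the function name `full_segmentation`, the rule that single characters are always allowed, and the toy lexicon are assumptions made for illustration.

```python
def full_segmentation(sentence, lexicon):
    """Enumerate every split of `sentence` into lexicon words.

    Assumption: a single character is always an acceptable word, so every
    sentence has at least one segmentation.
    """
    if not sentence:
        return [[]]
    results = []
    for end in range(1, len(sentence) + 1):
        word = sentence[:end]
        if end == 1 or word in lexicon:
            # Keep this prefix and recurse on the remainder.
            for rest in full_segmentation(sentence[end:], lexicon):
                results.append([word] + rest)
    return results


# Toy example (hypothetical); Latin letters stand in for Chinese characters.
# "abc" overlaps: both "ab" and "bc" are words.
candidates = full_segmentation("abc", {"ab", "bc"})
for candidate in candidates:
    print("/".join(candidate))
```

The number of candidates can grow exponentially with sentence length, so a practical system would need to prune or score candidates, for example with statistical features, instead of listing them all.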