The present paper describes the use of free online language resources for translating and expanding queries in cross-language information retrieval (CLIR). In a previous study, we proposed a method in which queries were translated by two machine translation systems on the Language Grid. The queries were then expanded using an online dictionary to translate compound words or word phrases. A concept base was used to compare back-translated words with the original query in order to delete mistranslated words. To evaluate the proposed method, we constructed a CLIR system and used the science documents of the NTCIR1 dataset. The proposed method achieved high precision. However, proper nouns (names of people and places) appear infrequently in science documents. In information retrieval, proper nouns present unique problems. Since proper nouns are usually unknown words, they are difficult to find in monolingual dictionaries, not to mention bilingual dictionaries. Furthermore, the initial query of the user is not always the best description of the desired information. Query expansion is often proposed as a way to solve this problem and to create a better query representation. In this paper, Wikipedia was used to translate compound words or word phrases, and was also used together with a concept base to expand queries. The NTCIR1 and NTCIR6 datasets were used to evaluate the proposed method. With the proposed method, the CLIR system achieved high precision, and the proposed system ranked higher than the systems that participated in NTCIR1 and NTCIR6.
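The back-translation check described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the similarity measure, the threshold, or the data structures, so the toy `BACK_TRANSLATIONS` table, the toy `CONCEPT_BASE`, the Jaccard overlap, and the function names are hypothetical stand-ins rather than the authors' implementation.

```python
from typing import Dict, List, Set

# Hypothetical toy data: romanized target-language candidates mapped to
# their back-translations into the source language.
BACK_TRANSLATIONS: Dict[str, str] = {
    "kensaku": "retrieval",
    "kaishuu": "recovery",
}

# Hypothetical toy concept base: each word is described by a set of
# attribute words. A real concept base would be far larger and weighted.
CONCEPT_BASE: Dict[str, Set[str]] = {
    "search": {"query", "information", "find", "document"},
    "retrieval": {"search", "information", "query", "document"},
    "recovery": {"health", "repair", "return"},
}


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two words' attribute sets (an assumed measure)."""
    if a == b:
        return 1.0
    attrs_a = CONCEPT_BASE.get(a, set())
    attrs_b = CONCEPT_BASE.get(b, set())
    if not attrs_a or not attrs_b:
        return 0.0
    return len(attrs_a & attrs_b) / len(attrs_a | attrs_b)


def filter_translations(
    source_word: str, candidates: List[str], threshold: float = 0.3
) -> List[str]:
    """Keep candidates whose back-translation stays close to the source word."""
    kept = []
    for candidate in candidates:
        back = BACK_TRANSLATIONS.get(candidate)
        # Drop candidates with no known back-translation or low similarity.
        if back is not None and similarity(source_word, back) >= threshold:
            kept.append(candidate)
    return kept


if __name__ == "__main__":
    print(filter_translations("search", ["kensaku", "kaishuu"]))
```

In this toy setting, a candidate whose back-translation shares most of its attributes with the original query word is kept, and one whose back-translation drifts to an unrelated concept is deleted as a likely mistranslation.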
It has been 380 years since the first carved edition of "Exploitation of the Works of Nature" was printed and published in 1637 (the tenth year of the Chongzhen reign in the Ming Dynasty). The book is an encyclopedia describing China's ancient agricultural and handicraft technologies and occupies an important position in the world history of science and technology. Its author, Song Yingxing, was praised as the "Chinese Diderot" by Dr. Joseph Needham, the British expert on the world history of science and technology. The book is of great value for linguistic and cultural research. From a linguistic perspective, this paper classifies and semantically analyzes the book's technical terms from agriculture, handicraft, and other disciplines, so as to provide an empirical reference for the better dissemination of the book and for the diachronic study of Chinese terminology.