Abstract: Text similarity has a wide range of applications in fields such as intelligent information retrieval, question answering systems, text rechecking, and machine translation. Meaning-based text similarity computation has been used increasingly for computing the similarity of words and phrases. Using the knowledge structure of the […] and its method of knowledge description, taking into account the other factors and weights that influence similarity, and making full use of the depth and density of the Concept-Sememe tree, this paper presents an improved method of Chinese word similarity calculation based on semantic distance. The effectiveness of the method is verified by simulation results.
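The idea of turning a tree distance into a similarity score can be illustrated with a small sketch. It is not the paper's method: the toy concept tree, the depth-based edge weight `1 / depth`, and the parameter `alpha` are all assumptions chosen for illustration, and a density term could be folded into `edge_weight` in the same way.

```python
# Toy concept tree (hypothetical, not taken from any real knowledge base):
# each node maps to its parent; "entity" is the root.
PARENT = {
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(node):
    """Depth of a node; the root has depth 0."""
    return len(path_to_root(node)) - 1


def edge_weight(node):
    """Weight of the edge from `node` to its parent (assumed: deeper edges count less)."""
    return 1.0 / depth(node)


def semantic_distance(a, b):
    """Sum of edge weights on the path between a and b through their lowest common ancestor."""
    up_a, up_b = path_to_root(a), path_to_root(b)
    common = next(n for n in up_a if n in up_b)

    def climb(path):
        return sum(edge_weight(n) for n in path[:path.index(common)])

    return climb(up_a) + climb(up_b)


def similarity(a, b, alpha=1.6):
    """Map a distance to a score in (0, 1]; alpha is an assumed tuning constant."""
    return alpha / (semantic_distance(a, b) + alpha)


print(similarity("dog", "cat"), similarity("dog", "tree"))
```

With this weighting, two siblings deep in the tree come out closer than two nodes that only meet at the root, which is the intuition behind using depth in the distance.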
Abstract: Chinese pinyin, the commonly used system for Romanizing standard Chinese, is a special form of the language. Compared with Chinese characters, pinyin has an advantage in spreading Chinese culture around the world.
Funding: supported by grants from the National Natural Science Foundation of China (30971042), the Innovative Research Team for Translational Neuropsychiatric Medicine, Zhejiang Province (2011R50049), and the Program for Changjiang Scholars and Innovative Research Team in University, Chinese Ministry of Education (IRT1038).
Abstract: Objective: When English-speaking people listen to the Deutsch "high-low" word illusion, they report hearing English words. Whether Chinese-speaking people report Chinese words when listening to the illusion, and whether any reported words are correlated with personality traits, as previous investigations have demonstrated for listening to music in other cultures, is open to question. The present study aimed to address this. Methods: A total of 308 right-handed, healthy volunteers (177 women and 131 men) were given the illusion test and asked to answer the Zuckerman-Kuhlman personality questionnaire (ZKPQ). Their depressive tendency was measured by the Plutchik-van Praag depression inventory (PVP). Results: There was no gender effect regarding either the PVP score or the number of reported Chinese words from the illusion. Women scored higher on ZKPQ neuroticism-anxiety than men. The number of meaningful Chinese words reported was correlated with the ZKPQ impulsive sensation-seeking, aggression-hostility, and activity scores. Some words reported by participants who scored higher on these three traits were related in meaning to those scales. Conclusion: Our preliminary results suggest that when Chinese-speaking people listen to the Deutsch "high-low" word illusion, they might use personality-related, specific cognitive schemata.
Funding: the National Natural Science Foundation of China, No. 30570609
Abstract: Patients with major depressive disorder (MDD) develop a negative cognitive bias, but how they respond to information in Chinese emotional words is unclear. Here we used a Stroop paradigm with subliminal Chinese emotional words to explore the event-related potential components of abnormal emotional processing in patients with MDD. The correct rate was similar in the MDD and normal control groups, but reaction time was longer in the MDD group than in normal controls, especially for negative and neutral stimuli. For N270, repeated-measures analysis of variance demonstrated a significant main effect of the relation electrode and valence on peak amplitude, and interactions between valence and electrode site. The peak amplitudes of the three kinds of words differed in the two groups (positive > negative > neutral). The topography of the difference waves indicated that the difference was distributed over the frontal and left parietal-temporal sites across the scalp. For N400, there was a significant main effect of the relation electrode and valence on peak amplitude, and the latency showed a main effect of electrode and an interaction between electrode and group. The amplitudes induced by each type of word were significantly different from each other in both groups (positive > negative > neutral). The topography of the difference waves indicated that the effect of relation type was primarily at left and right frontal, central, and left parietal-temporal regions. Both MDD patients and normal controls exhibited significant emotional Stroop effects during the processing of positive/negative Chinese emotional words. MDD patients showed interference in emotional stimuli in early cognitive processing that induced psychological resource intervention during late emotional information processing.
Abstract: This paper analyzes previous studies on animal words and points out some existing problems. Based on the data and research, it also reveals the similarities and differences between Chinese and English animal words and identifies the reasons why animal words differ in cultural connotation.
Funding: Research Startup Project for Doctors at the School of Foreign Languages, University of Shanghai for Science and Technology (Fund Project No.: 1F-21-305-101).
Abstract: This paper deals with the translation strategies for Chinese Culture-Loaded Words from the perspective of adaptation theory. It is based on the translated text of the sixth episode, "Silk Road", and the seventh episode, "Dunhuang", of the documentary Hexi Corridor. Many words with Chinese cultural connotations appear in the subtitles of this documentary. The paper is divided into four parts. The first and second parts deal with the basic theories, i.e., the definitions of Chinese Culture-Loaded Words and of adaptation theory. The third part analyses the original text, covering the background and the specifics of the language of the documentary film Hexi Corridor. The fourth part deals with the difficulties the author encountered in translation practice and the corresponding solutions adopted. The translation difficulties are solved by five translation methods, namely transliteration, loan translation, substitution, interpretation, and adaptation.
Funding: National Social Science Foundation in China, No. 03BYY013; the Science Foundation of Jiangsu Province, No. "333" Project and QL200504
Abstract: BACKGROUND: Studies have shown that closed-class words, such as prepositions and conjunctions, induce a left anterior negativity (N280), indicating that N280 should be a specific component of the word category. OBJECTIVE: To observe whether Chinese prepositions and verbs exhibit different linguistic functions, to determine whether they are processed by different neural systems, and to verify that N280 is a specific component. DESIGN, TIME AND SETTING: The neurolinguistics experiment was performed at Xuzhou Normal University between November and December 2006. PARTICIPANTS: Sixteen undergraduate students, comprising 8 females and 8 males, with no mental or neuropathological history, were selected. METHODS: A total of 15 verbs and prepositions were used as linguistic stimuli, and each verb and preposition was combined to produce four correct phrase collocations and four incorrect ones. MAIN OUTCOME MEASURES: Event-related potentials were recorded in the subjects while they read correct or incorrect phrases flashed upon a video screen. RESULTS: Both verbs and prepositions elicited negativity at the frontal site in a 230-330 ms window, as well as at the fronto-temporal and central sites in a 350-500 ms window. Neither exhibited significant differences in peak [F(1, 15) = 0.144, P = 0.710] or latency [F(1, 15) = 0.144, P = 0.710]. Both verbs and prepositions elicited negativity at the left and right hemispheres in a 270-400 ms window. CONCLUSION: There was no significant difference between Chinese prepositions and verbs in neural processing, and N280 was not a specific component for closed-class words.
Abstract: Lexes are the most important and basic element of a language. Chinese culture-loaded lexes are words or expressions that are rich in Chinese culture and reflect the characteristics of Chinese culture and the Chinese nation. It is therefore of great significance to pay attention to the translation of Chinese culture-loaded lexes, as they play a decisive role in disseminating Chinese culture. Doing so can help promote Chinese culture worldwide, improve China's cultural exchanges and communication with other nations, and strengthen China's status in the world. This paper focuses on Chinese culture-loaded words and proposes some possible means of translation with the purpose of spreading Chinese culture.
Abstract: South Africa Can Work: How a free market and decentralised government will make us a winning nation. By Frans Rautenbach. Penguin Random House South Africa. What will it take to turn South Africa around? In this insightful book, Frans Rautenbach proposes a complete overhaul of policy thinking, and provides fresh arguments that effectively address South Africa's high unemployment and lack of education. Rautenbach examines the fundamental problem of rent-seeking, to which he proposes two antidotes: the free market and decentralization of government. Along the way he tackles holy cows such as affirmative action, trade unions,
Abstract: In recent years, the abundance of words with Chinese characteristics in media translation has reflected, on the one hand, China's unique society, politics, economics and culture and, on the other hand, an important source of foreign words. According to postcolonial cultural translation theory, through different hybridization tactics these words form a third part, which is different from both the source language and the target language and is a product of globalization. This notion of hybridity not only has great bearing on translators' selection of translation strategies, but also helps to change some accepted conceptions about translation, and thus enables a better understanding of media translation. It promotes China's discourse power in international society and the objectivity of reports on China, which constructs a third space in media translation between the West and China.
Abstract: Contains 210 frequently used measure words. Includes nominal measure words, verbal measure words, concurrent measure words, etc. Compiled according to the HSK examination outline. Features multiple retrieval
Funding: supported by the National Natural Science Foundation of China under Grant Nos. 61751201 and 61672162, and the Shanghai Municipal Science and Technology Major Project under Grant Nos. 2018SHZDZX01 and ZJLab.
Abstract: Semi-Markov conditional random fields (Semi-CRFs) have been successfully utilized in many segmentation problems, including Chinese word segmentation (CWS). The advantage of Semi-CRF lies in its inherent ability to exploit properties of segments instead of individual elements of sequences. Despite this theoretical advantage, Semi-CRF is still not the best choice for CWS because its computational complexity is quadratic in the sentence length. In this paper, we propose a simple yet effective framework to help Semi-CRF achieve performance comparable with CRF-based models under similar computational complexity. Specifically, we first adopt a bi-directional long short-term memory (BiLSTM) network at the character level to model context information, and then use a simple but effective fusion layer to represent segment information. In addition, to model arbitrarily long segments within linear time complexity, we propose a new model named Semi-CRF-Relay. The direct modeling of segments makes the combination with word features easy, and CWS performance can be enhanced merely by adding publicly available pre-trained word embeddings. Experiments on four popular CWS datasets show the effectiveness of our proposed methods. The source code and pre-trained embeddings of this paper are available at https://github.com/fastnlp/fastNLP/.
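Segment-level decoding, the step whose cost grows with the allowed segment length, can be illustrated with a small semi-Markov Viterbi routine. This is a minimal sketch and not the paper's model: `segment_score`, the toy lexicon `LEXICON`, and `max_len` are hypothetical stand-ins for the learned segment scorer described in the abstract.

```python
def semi_markov_decode(n, segment_score, max_len):
    """Find the best segmentation of positions [0, n).

    segment_score(i, j) scores the segment covering characters i..j-1.
    Runs in O(n * max_len) score calls; with max_len == n this is
    quadratic in the sentence length.
    """
    best = [float("-inf")] * (n + 1)  # best[j]: best score of a segmentation of [0, j)
    back = [0] * (n + 1)              # back[j]: start of the last segment ending at j
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            score = best[i] + segment_score(i, j)
            if score > best[j]:
                best[j], back[j] = score, i
    # Follow back-pointers from the end of the sentence to recover the segments.
    segments, j = [], n
    while j > 0:
        segments.append((back[j], j))
        j = back[j]
    return segments[::-1]


# Hypothetical scorer: known words get a positive score, anything else a penalty.
TEXT = "我爱北京"
LEXICON = {"我": 0.5, "爱": 0.5, "北京": 2.0}


def toy_score(i, j):
    return LEXICON.get(TEXT[i:j], -1.0)


segments = semi_markov_decode(len(TEXT), toy_score, max_len=4)
print([TEXT[i:j] for i, j in segments])
```

Capping `max_len` keeps decoding linear in the sentence length but rules out longer segments, which is the trade-off the abstract refers to when it mentions modeling arbitrarily long segments.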
Funding: the Research Grants Council of Hong Kong S.A.R., China, through the CERG under Grant No. 9040861 (CityU 1318/03H); City University of Hong Kong through the Strategic Research under Grant No. 7002037.
Abstract: As a powerful sequence labeling model, conditional random fields (CRFs) have had successful applications in many natural language processing (NLP) tasks. However, the high complexity of CRF training only allows a very small tag (or label) set, because training becomes intractable as the tag set enlarges. This paper proposes an improved decomposed training and joint decoding algorithm for CRF learning. Instead of training a single CRF model for all tags, it trains a binary sub-CRF independently for each tag. An optimal tag sequence is then produced by a joint decoding algorithm based on the probabilistic output of all sub-CRFs involved. To test its effectiveness, we apply this approach to Chinese word segmentation (CWS) as a sequence labeling problem. Our evaluation shows that it can reduce the computational cost of this language processing task by 40-50% without any significant performance loss on various large-scale data sets.
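One way to picture joint decoding over independent per-tag models is a Viterbi search that combines their probabilities while enforcing legal tag transitions. The sketch below is an assumption-laden illustration, not the paper's algorithm: the B/M/E/S tag scheme, the `ALLOWED` transition table, and the input format (one dict of per-tag probabilities per character) are all chosen here for the example.

```python
import math

TAGS = ("B", "M", "E", "S")  # begin, middle, end of a word, single-character word
# Tag bigrams that form a legal segmentation under the B/M/E/S scheme.
ALLOWED = {("B", "M"), ("B", "E"), ("M", "M"), ("M", "E"),
           ("E", "B"), ("E", "S"), ("S", "B"), ("S", "S")}


def joint_decode(probs):
    """Pick the most probable legal tag sequence.

    probs[i][t] is the probability that character i carries tag t, as
    reported by the binary model for tag t (values need not sum to 1).
    """
    if not probs:
        return []

    def log(p):
        return math.log(max(p, 1e-12))

    # A sentence can only start with B or S.
    best = [{t: log(probs[0][t]) if t in ("B", "S") else float("-inf") for t in TAGS}]
    back = []
    for i in range(1, len(probs)):
        scores, pointers = {}, {}
        for t in TAGS:
            prev_score, prev_tag = max(
                (best[-1][s], s) for s in TAGS if (s, t) in ALLOWED
            )
            scores[t] = prev_score + log(probs[i][t])
            pointers[t] = prev_tag
        best.append(scores)
        back.append(pointers)
    # A sentence can only end with E or S.
    last = max(("E", "S"), key=lambda t: best[-1][t])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


example = [
    {"B": 0.9, "M": 0.1, "E": 0.1, "S": 0.2},
    {"B": 0.1, "M": 0.2, "E": 0.3, "S": 0.6},  # per-position argmax "S" is illegal after "B"
    {"B": 0.1, "M": 0.1, "E": 0.2, "S": 0.9},
]
print(joint_decode(example))
```

Taking the highest-scoring tag at each position independently would give B, S, S here, which is not a valid segmentation; decoding jointly repairs it to B, E, S.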
Abstract: The resolution of overlapping ambiguity strings (OAS) is studied based on the maximum entropy model. There are two model outputs: either the first two characters form a word or the last two characters form a word. The features of the model include one word in the context of the OAS, the current OAS, and the word probability relation of the two kinds of segmentation results. OAS in the training text are found by combining the FMM and BMM segmentation methods. After feature tagging they are used to train the maximum entropy model. The People's Daily corpus of January 1998 is used in training and testing. Experimental results show a closed test precision of 98.64% and an open test precision of 95.01%. The open test precision is 3.76% better than that of the common word probability method.
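Finding candidate ambiguity strings by comparing forward and backward maximum matching (FMM and BMM) can be sketched as follows. The dictionary, the `max_len` limit, and the helper names are hypothetical, and the sketch only locates regions where the two segmenters disagree; it does not include the maximum entropy classifier.

```python
def fmm(text, vocab, max_len=4):
    """Forward maximum matching: take the longest dictionary word from the left."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:  # fall back to a single character
                words.append(piece)
                i += size
                break
    return words


def bmm(text, vocab, max_len=4):
    """Backward maximum matching: take the longest dictionary word from the right."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    return words[::-1]


def boundaries(words):
    """Character offsets at which a segmentation places a cut, including both ends."""
    cuts, pos = {0}, 0
    for w in words:
        pos += len(w)
        cuts.add(pos)
    return cuts


def candidate_oas(text, vocab, max_len=4):
    """Return the substrings where FMM and BMM segment the text differently."""
    f = boundaries(fmm(text, vocab, max_len))
    b = boundaries(bmm(text, vocab, max_len))
    shared = sorted(f & b)
    found = []
    for start, end in zip(shared, shared[1:]):
        # Between two shared cuts, any extra cut means the segmenters disagree.
        if any((p in f) != (p in b) for p in range(start + 1, end)):
            found.append(text[start:end])
    return found


vocab = {"从小", "小学"}
print(fmm("从小学", vocab), bmm("从小学", vocab), candidate_oas("从小学", vocab))
```

In the toy input, FMM groups the first two characters and BMM groups the last two, which corresponds to the two model outputs described in the abstract.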
Funding: a National Nature Science Fund Project (61661051); Key Laboratory of Education Information of Nationalities, Ministry of Education; Yunnan Key Laboratory of Smart Education; Program for Innovative Research Team (in Science and Technology) in University of Yunnan Province; Kunming Key Laboratory of Education Information.
Abstract: Chinese word segmentation plays an important role in search engines, artificial intelligence, machine translation, and so on. There are currently three main kinds of word segmentation algorithms: dictionary-based, statistics-based, and understanding-based. However, few people combine all three methods or two of them. Therefore, a Chinese word segmentation model is proposed that combines a statistical word segmentation algorithm with an understanding-based word segmentation algorithm. It combines Hidden Markov Model (HMM) word segmentation and Bi-LSTM word segmentation to improve accuracy. The main method is to compute lexical statistics on the results of the two segmenters, choose the best results based on those statistics, and then combine them into the final word segmentation results. This combined word segmentation model is applied to perform experiments on the MSRA corpus provided by Bakeoff. Experiments show that the accuracy of the word segmentation results is 12.52% higher than that of the traditional HMM model and 0.19% higher than that of the Bi-LSTM model.