Abstract: Editor's Note: Every year a number of new words and phrases from the Internet make it into a language's vocabulary. Sorting through the most popular buzzwords and phrases used by China's netizens reveals a virtual smorgasbord of linguistic ingenuity and also what was most important in the last 12 months. Let's take a look at what was on the tip of China's tongue in 2013.
Abstract: To start with, I mention that certain new words and expressions are used relatively often by journalists when they compose reports and articles. Their writing is said to be all right for a newspaper, but it lacks imagination and beauty. Examples: 1. After the huddle with top Democrats and Republicans, the President...
Abstract: New words and expressions I have mentioned in my previous article (see 《福建外语》, 1984, No. 4) are supplemented and amplified with the following: 1. The capital put on festive air (首都披上节日盛装).
Funding: Natural Science Foundation of Shanghai, China (No. 18ZR1401200); Special Fund for Innovation and Development of Shanghai Industrial Internet, China (No. 2019-GYHLW-01004).
Abstract: A novel method of constructing a sentiment lexicon of new words (SLNW) is proposed to realize effective Weibo sentiment analysis by integrating existing sentiment lexicons with lexicons of degree, negation, and network words. Based on left-right entropy and mutual information (MI) neologism discovery algorithms, the new algorithm divides N-grams to obtain strings dynamically instead of relying on a fixed sliding window, using a Trie as the data structure. The sentiment-oriented pointwise mutual information (SO-PMI) algorithm with Laplacian smoothing is used to determine the sentiment tendency of the new words found in the data set, and SLNW is formed by adding these new words to the basic sentiment lexicon. Experiments show that sentiment analysis based on SLNW performs better than the other methods compared. Precision, recall, and F-measure are improved on both topic and non-topic Weibo data sets.
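The two scoring ideas named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the document-level co-occurrence counts, the seed-word lists, and the exact placement of the Laplace constant `alpha` are all assumptions made for illustration. The sketch also scans a plain string rather than a Trie and does not reproduce the dynamic N-gram division.

```python
import math
from collections import Counter


def _entropy(neighbours):
    """Shannon entropy (natural log) of a neighbour-character distribution."""
    total = sum(neighbours.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log(c / total) for c in neighbours.values())


def branch_entropy(text, cand):
    """Minimum of the left and right neighbour entropy of `cand` in `text`.

    A high value means the candidate appears in varied contexts, which is
    the usual evidence that it is a free-standing word.
    """
    left, right = Counter(), Counter()
    start = text.find(cand)
    while start != -1:
        if start > 0:
            left[text[start - 1]] += 1
        end = start + len(cand)
        if end < len(text):
            right[text[end]] += 1
        start = text.find(cand, start + 1)
    return min(_entropy(left), _entropy(right))


def so_pmi(word, docs, pos_seeds, neg_seeds, alpha=1.0):
    """SO-PMI of `word` against positive and negative seed words.

    `docs` is a list of token lists. Counts are document frequencies, and
    `alpha` is added to every count (one assumed form of Laplace smoothing)
    so that a zero co-occurrence count never produces log(0).
    """
    doc_sets = [set(d) for d in docs]
    n_docs = len(doc_sets)

    def df(*terms):
        return sum(1 for d in doc_sets if all(t in d for t in terms))

    def pmi(a, b):
        return math.log2(
            (df(a, b) + alpha) * n_docs / ((df(a) + alpha) * (df(b) + alpha))
        )

    return sum(pmi(word, s) for s in pos_seeds) - sum(pmi(word, s) for s in neg_seeds)


# Toy corpus: each inner list is one already-tokenized post (hypothetical data).
docs = [["给力", "开心"], ["给力", "喜欢"], ["坑爹", "讨厌"], ["坑爹", "难过"]]
pos_seeds, neg_seeds = ["开心", "喜欢"], ["讨厌", "难过"]
print(so_pmi("给力", docs, pos_seeds, neg_seeds))  # positive score
print(so_pmi("坑爹", docs, pos_seeds, neg_seeds))  # negative score
```

A word with a positive score would be added to the positive side of the lexicon and a word with a negative score to the negative side; any threshold for treating a score as neutral would have to be chosen separately.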
Abstract: After 23 years of cooperation, the auto industry is embracing new changes. Various new forms of cooperation are coming into use in a field that used to be dominated by introduced technology, brand names, or funds.
Funding: Partially supported by the Doctor Startup Fund of Liaoning Province under Grant No. 20101021.
Abstract: Chinese new words are particularly problematic in Chinese natural language processing. With the fast development of the Internet and the information explosion, it is impossible to obtain a complete system lexicon for applications in Chinese natural language processing, because new words outside the dictionaries are constantly being created. New word identification and POS tagging are usually performed separately, so the features of lexical information cannot be fully used. A latent discriminative model, which combines the strengths of the Latent Dynamic Conditional Random Field (LDCRF) and the semi-CRF, is proposed to detect new words together with their POS tags synchronously, regardless of the types of new words, from Chinese text that has not been pre-segmented. Unlike the semi-CRF, the proposed latent discriminative model applies LDCRF to generate candidate entities, which accelerates training and decreases the computational cost. The complexity of the proposed hidden semi-CRF can be further adjusted by tuning the number of hidden variables and the number of candidate entities taken from the N-best outputs of the LDCRF model. A new-word-generating framework is proposed for model training and testing, under which the definitions and distributions of new words conform to those in real text. A global feature called "Global Fragment Features" is adopted for new word identification. We tested our model on the corpus from SIGHAN-6. Experimental results show that the proposed method is capable of detecting even low-frequency new words together with their POS tags with satisfactory results. The proposed model performs competitively with state-of-the-art models.
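The cost saving from candidate generation can be illustrated with a small Python sketch of segment-level (semi-Markov) Viterbi decoding that only considers multi-character spans from a supplied candidate set. It is a sketch under stated assumptions, not the model in the abstract: the `decode` function, the `score` callback, and the toy weights are hypothetical, and the hidden variables, the Global Fragment Features, and the LDCRF that would produce the candidates are not modelled.

```python
from typing import Callable, List, Optional, Set, Tuple

Span = Tuple[int, int]  # [start, end) character offsets


def decode(
    text: str,
    candidates: Set[Span],
    labels: List[str],
    score: Callable[[str, str, Optional[str]], float],
    max_len: int = 4,
) -> List[Tuple[str, str]]:
    """Return the best-scoring (segment, POS label) sequence for `text`.

    Only multi-character spans listed in `candidates` are explored.
    Single characters are always allowed so that a complete path exists.
    """
    n = len(text)
    # best[i][label] = (score of best path ending at i with `label`, backpointer)
    best = [dict() for _ in range(n + 1)]
    best[0][None] = (0.0, None)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            if end - start > 1 and (start, end) not in candidates:
                continue  # pruned: this span was never proposed
            for prev_label, (prev_score, _) in best[start].items():
                for label in labels:
                    s = prev_score + score(text[start:end], label, prev_label)
                    if label not in best[end] or s > best[end][label][0]:
                        best[end][label] = (s, (start, prev_label))
    # Follow backpointers from the best final label.
    label = max(best[n], key=lambda l: best[n][l][0])
    end, out = n, []
    while end > 0:
        _, (start, prev_label) = best[end][label]
        out.append((text[start:end], label))
        end, label = start, prev_label
    return out[::-1]


# Hypothetical scores standing in for a trained model's segment potentials.
TOY_WEIGHTS = {("我", "r"): 1.0, ("爱", "v"): 1.0, ("微博", "n"): 3.0}


def toy_score(segment: str, label: str, prev_label: Optional[str]) -> float:
    return TOY_WEIGHTS.get((segment, label), -1.0)


result = decode("我爱微博", {(2, 4)}, ["n", "v", "r"], toy_score)
print(result)
```

With an unrestricted semi-Markov decoder, every span up to `max_len` characters is scored at every position; here the inner loops run only for single characters and the proposed spans, which is the kind of reduction the abstract attributes to using LDCRF outputs as candidates.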