Funding: Partially supported by the Doctor Startup Fund of Liaoning Province under Grant No. 20101021.
Abstract: New words pose a particular problem for Chinese natural language processing. With the rapid development of the Internet and the resulting information explosion, no system lexicon for Chinese natural language processing applications can be complete, because out-of-dictionary words are constantly being created. New word identification and POS tagging are usually performed as separate procedures, so lexical features cannot be fully exploited. We propose a latent discriminative model that combines the strengths of the Latent Dynamic Conditional Random Field (LDCRF) and the semi-CRF to detect new words of any type simultaneously with their POS tags, working directly on Chinese text that has not been pre-segmented. Unlike a standard semi-CRF, the proposed model uses an LDCRF to generate candidate entities, which speeds up training and reduces computational cost. The complexity of the proposed hidden semi-CRF can be further adjusted by tuning the number of hidden variables and the number of candidate entities taken from the N-best outputs of the LDCRF model. A new-word-generating framework is proposed for model training and testing, under which the definitions and distributions of new words conform to those in real text. A global feature, called "Global Fragment Features", is adopted for new word identification. We tested our model on the SIGHAN-6 corpus. Experimental results show that the proposed method can satisfactorily detect even low-frequency new words together with their POS tags, and that the model performs competitively with state-of-the-art models.
Funding: Supported by a National Research Foundation of Korea grant funded by the Korean Government (NRF-2010-327-B00342).
Abstract: As the global economy has become further integrated, the international production chain has become more sophisticated, with different stages of production located in different countries. Economic theorists have argued that the fragmentation of the global production chain is partly attributable to the high growth in international trade over the past several decades. In this study, we examine vertical specialization in China, Japan and Korea, and its contribution to these nations' trade. Using a multilevel model, we show that vertical specialization has encouraged increases in trade among all three countries. In particular, China's outcome is remarkable considering how recently it became a member of the WTO.