Abstract: Through the research methods of computer-corpus translation analysis and interlanguage research, and through a comparative analysis of three of more than four hundred thousand words, this paper examines linking adverbs in Chinese and English corpora and the corresponding Chinese word retrieval results, gives a detailed description of their characteristics for Chinese learners of English and native English speakers, and thus provides an application example for the use of corpora in translation studies.
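The kind of corpus comparison described above can be illustrated with a normalized frequency count. The following is a minimal sketch, not the authors' procedure: the adverb list, the tokenizer, and the per-10,000-word normalization are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical list of linking adverbs; the abstract does not name the items studied.
LINKING_ADVERBS = ("however", "therefore", "moreover")


def per_10k(text, adverbs=LINKING_ADVERBS):
    """Return the frequency of each adverb per 10,000 word tokens in `text`."""
    # Crude tokenizer (assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t in adverbs)
    total = len(tokens)
    return {a: (counts[a] * 10_000 / total if total else 0.0) for a in adverbs}


# Toy stand-ins for a learner corpus and a native-speaker corpus.
learner_text = "However, it is good. Therefore we go. However, I think so."
native_text = "It is good, so we go. I think so, however."

learner_rates = per_10k(learner_text)
native_rates = per_10k(native_text)
```

Normalizing by corpus size lets sub-corpora of different lengths be compared directly, which a raw count would not allow.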
Funding: Project supported by the National Natural Science Foundation of China (No. 61672440), the Natural Science Foundation of Fujian Province, China (No. 2016J05161), the Research Fund of the State Key Laboratory for Novel Software Technology in Nanjing University, China (No. KFKT2015B11), the Scientific Research Project of the National Language Committee of China (No. YB135-49), and the Fundamental Research Funds for the Central Universities, China (No. ZK1024).
Abstract: A lack of labeled corpora obstructs research progress on implicit discourse relation recognition (DRR) for Chinese, while discourse corpora are available in other languages, such as English. In this paper, we propose a cross-lingual implicit DRR framework that exploits an available English corpus for the Chinese DRR task. We use machine translation to generate Chinese instances from a labeled English discourse corpus. In this way, each instance has two independent views: a Chinese view and an English view. We then train two classifiers, one for Chinese and one for English, in a co-training fashion, which exploits unlabeled Chinese data to improve implicit DRR for Chinese. Experimental results demonstrate the effectiveness of our method.
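The two-view co-training idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the nearest-centroid classifier, the margin-based confidence score, the toy feature vectors, the label names, and the assumption that every unlabeled instance already has both a Chinese and an English feature vector are all hypothetical.

```python
import math
from collections import defaultdict


class CentroidClassifier:
    """Toy nearest-centroid model; a stand-in for any real classifier."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[label] += 1
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict_with_confidence(self, x):
        # Confidence (assumption): distance margin between the two nearest centroids.
        dists = sorted((math.dist(x, c), label) for label, c in self.centroids.items())
        margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
        return dists[0][1], margin


def co_train(labeled, unlabeled, rounds=3, per_round=1):
    """labeled: [(zh_vec, en_vec, label)]; unlabeled: [(zh_vec, en_vec)]."""
    labeled, pool = list(labeled), list(unlabeled)
    zh_clf, en_clf = CentroidClassifier(), CentroidClassifier()

    def refit():
        labels = [y for _, _, y in labeled]
        zh_clf.fit([zh for zh, _, _ in labeled], labels)
        en_clf.fit([en for _, en, _ in labeled], labels)

    for _ in range(rounds):
        refit()
        if not pool:
            break
        picked = {}  # pool index -> pseudo-label
        # Each view pseudo-labels its most confident pool items for the shared set.
        for clf, view in ((zh_clf, 0), (en_clf, 1)):
            scored = sorted(
                ((clf.predict_with_confidence(item[view]), idx)
                 for idx, item in enumerate(pool)),
                key=lambda t: t[0][1], reverse=True)
            for (label, _), idx in scored[:per_round]:
                picked.setdefault(idx, label)
        labeled += [(pool[i][0], pool[i][1], lab) for i, lab in picked.items()]
        pool = [item for i, item in enumerate(pool) if i not in picked]
    refit()  # final fit so both models see every pseudo-labeled instance
    return zh_clf, en_clf


# Toy data: two labeled instances (both views) and two unlabeled ones.
labeled = [([0.0, 0.0], [0.0, 1.0], "Contingency"),
           ([1.0, 1.0], [1.0, 0.0], "Comparison")]
unlabeled = [([0.1, 0.2], [0.1, 0.9]), ([0.9, 0.8], [0.8, 0.1])]
zh_clf, en_clf = co_train(labeled, unlabeled, rounds=2, per_round=1)
```

Each classifier sees only its own view, so an instance that one view labels with high confidence can still be informative for the other; that is the property the two "independent views" are meant to provide.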