Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61836007, 61773276) and the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions.
Abstract: The discourse analysis task, which focuses on understanding the semantics of long text spans, has received increasing attention in recent years. As a critical component of discourse analysis, discourse relation recognition aims to identify the rhetorical relations between adjacent discourse units (e.g., clauses, sentences, and sentence groups), called arguments, in a document. Previous work focused on capturing the semantic interactions between arguments to recognize their discourse relations, ignoring important textual information in the surrounding context. In many cases, however, the semantic interactions between the texts of the two arguments are not enough to identify their rhetorical relation, and more contextual clues must be mined. In this paper, we propose a method that converts the RST-style discourse trees in the training set into dependency-based trees and trains a contextual evidence selector on these transformed structures. In this way, the selector learns to automatically pick critical textual information from the context (i.e., evidence) for the arguments to help discriminate their relations. We then encode the arguments concatenated with the corresponding evidence to obtain enhanced argument representations. Finally, we combine the original and enhanced argument representations to recognize the relations. In addition, we introduce auxiliary tasks that guide the training of the evidence selector and strengthen its selection ability. Experimental results on the Chinese CDTB dataset show that our method outperforms several state-of-the-art baselines in both micro and macro F1 scores.
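The abstract does not spell out how the RST-style trees are converted into dependency-based trees. As a minimal sketch of one common head-percolation conversion (an assumption here, not necessarily the authors' procedure), the head of each span is taken from its leftmost nucleus, and the heads of all sibling spans attach to it. The `Node` class, the `to_dependencies` function, and the `"N"`/`"S"` labels are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """Simplified RST-style node (hypothetical structure, for illustration)."""
    edu: Optional[int] = None                 # leaf: index of the discourse unit
    children: Optional[List["Node"]] = None   # internal node: child subtrees
    nuclearity: Optional[List[str]] = None    # "N" or "S", one per child


def to_dependencies(root: Node) -> Dict[int, int]:
    """Map each dependent unit to its head unit; the root unit maps to -1."""
    heads: Dict[int, int] = {}

    def head_of(node: Node) -> int:
        if node.edu is not None:
            return node.edu
        child_heads = [head_of(child) for child in node.children]
        # Assumption: the leftmost nucleus supplies the head of the span.
        top = child_heads[node.nuclearity.index("N")]
        for h in child_heads:
            if h != top:
                heads[h] = top  # sibling heads depend on the span head
        return top

    heads[head_of(root)] = -1
    return heads


# Units 0 and 1 form a nucleus-satellite span; that span is the nucleus
# and unit 2 is its satellite.
example = Node(
    children=[
        Node(children=[Node(edu=0), Node(edu=1)], nuclearity=["N", "S"]),
        Node(edu=2),
    ],
    nuclearity=["N", "S"],
)
deps = to_dependencies(example)
```

In the resulting structure, every unit other than the root has exactly one head, so the units that are directly linked to an argument are easy to enumerate as candidate contextual evidence.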
Funding: Project supported by the National Natural Science Foundation of China (No. 61672440), the Natural Science Foundation of Fujian Province, China (No. 2016J05161), the Research Fund of the State Key Laboratory for Novel Software Technology in Nanjing University, China (No. KFKT2015B11), the Scientific Research Project of the National Language Committee of China (No. YB135-49), and the Fundamental Research Funds for the Central Universities, China (No. ZK1024).
Abstract: A lack of labeled corpora obstructs research progress on implicit discourse relation recognition (DRR) for Chinese, whereas discourse corpora are available in other languages, such as English. In this paper, we propose a cross-lingual implicit DRR framework that exploits an available English corpus for the Chinese DRR task. We use machine translation to generate Chinese instances from a labeled English discourse corpus, so that each instance has two independent views: a Chinese view and an English view. We then train two classifiers, one per view, by co-training, which exploits unlabeled Chinese data to improve implicit DRR for Chinese. Experimental results demonstrate the effectiveness of our method.
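The co-training step described above can be sketched as a generic loop over two views. This is only an illustration under stated assumptions, not the authors' implementation: it assumes every unlabeled instance already has both a Chinese and an English view (for example, via machine translation), and the names `co_train` and `centroid_trainer`, the 1-D features, and the confidence score are all hypothetical.

```python
from typing import Callable, List, Tuple

Features = List[float]
Predictor = Callable[[Features], Tuple[int, float]]   # returns (label, confidence)
Trainer = Callable[[List[Features], List[int]], Predictor]


def centroid_trainer(xs: List[Features], ys: List[int]) -> Predictor:
    """Toy nearest-centroid classifier on the first feature (illustration only)."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x[0])
    centroids = {y: sum(v) / len(v) for y, v in groups.items()}

    def predict(x: Features) -> Tuple[int, float]:
        dist, label = min((abs(x[0] - c), y) for y, c in centroids.items())
        return label, 1.0 / (1.0 + dist)   # closer centroid, higher confidence

    return predict


def co_train(train_zh: Trainer, train_en: Trainer, labeled, unlabeled,
             rounds: int = 3, per_round: int = 1):
    """labeled: (zh_view, en_view, label) triples; unlabeled: (zh_view, en_view) pairs."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        clf_zh = train_zh([z for z, _, _ in labeled], [y for _, _, y in labeled])
        clf_en = train_en([e for _, e, _ in labeled], [y for _, _, y in labeled])
        scored = []
        for i, (zh, en) in enumerate(pool):
            label_zh, conf_zh = clf_zh(zh)
            label_en, conf_en = clf_en(en)
            # Keep the prediction of whichever view is more confident.
            if conf_zh >= conf_en:
                scored.append((conf_zh, i, label_zh))
            else:
                scored.append((conf_en, i, label_en))
        chosen = sorted(scored, reverse=True)[:per_round]
        for _, i, label in chosen:
            labeled.append((pool[i][0], pool[i][1], label))   # pseudo-labeled instance
        taken = {i for _, i, _ in chosen}
        pool = [p for i, p in enumerate(pool) if i not in taken]
    # Retrain both view-specific classifiers on the enlarged labeled set.
    final_zh = train_zh([z for z, _, _ in labeled], [y for _, _, y in labeled])
    final_en = train_en([e for _, e, _ in labeled], [y for _, _, y in labeled])
    return final_zh, final_en, labeled


seed = [([0.0], [0.1], 0), ([1.0], [0.9], 1)]
pool = [([0.1], [0.0]), ([0.9], [1.0]), ([0.5], [0.6])]
clf_zh, clf_en, grown = co_train(centroid_trainer, centroid_trainer, seed, pool)
```

Each round moves the most confidently predicted instances from the unlabeled pool into the training set, so the two view-specific classifiers gradually supply training labels to each other.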