摘要
Discourse relation classification is a fundamental task for discourse analysis,which is essential for understanding the structure and connection of texts.Implicit discourse relation classification aims to determine the relationship between adjacent sentences and is very challenging because it lacks explicit discourse connectives as linguistic cues and sufficient annotated training data.In this paper,we propose a discriminative instance selection method to construct synthetic implicit discourse relation data from easy-to-collect explicit discourse relations.An expanded instance consists of an argument pair and its sense label.We introduce the argument pair type classification task,which aims to distinguish between implicit and explicit argument pairs and select the explicit argument pairs that are most similar to natural implicit argument pairs for data expansion.We also propose a simple label-smoothing technique to assign robust sense labels for the selected argument pairs.We evaluate our method on PDTB 2.0 and PDTB 3.0.The results show that our method can consistently improve the performance of the baseline model,and achieve competitive results with the state-of-the-art models.
基金
National Natural Science Foundation of China(Grant Nos.62376166,62306188,61876113)
National Key R&D Program of China(No.2022YFC3303504).