Abstract
[Objective] This paper proposes an implicit discourse relation analysis method that combines a single classifier with a multi-task network; that is, it uses a single-classifier-based multi-task learning model to recognize Chinese implicit discourse relations. [Methods] The multi-task learning component jointly models implicit and explicit discourse relations to obtain better results, while the single classifier is trained by converting the four-class classification problem into binary classification. [Results] On the Harbin Institute of Technology Chinese discourse-level semantic relation corpus (HIT-CDTB), the F1 scores for the expansion and parallel relations reach 0.94 and 0.81, respectively, and the F1 scores for all four discourse relations improve significantly. [Limitations] The model's performance can be improved further; the dataset is not well balanced and needs to be expanded. [Conclusions] On HIT-CDTB, the proposed method outperforms the best previously reported results. The experiments also confirm that deleting connectives adds noise to the training set and degrades performance.
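The abstract does not spell out how the four-class problem is turned into binary classification. One common reading is a one-vs-rest relabelling, sketched below under that assumption. The function name `to_binary_tasks`, the toy sentences, and the labels `relation_3` and `relation_4` are hypothetical; only the expansion and parallel relations are named in the abstract.

```python
from typing import Dict, List, Tuple

# Only "expansion" and "parallel" are named in the abstract; the other two
# labels are hypothetical placeholders for the remaining relation types.
RELATIONS = ["expansion", "parallel", "relation_3", "relation_4"]

Sample = Tuple[str, str, str]          # (argument 1, argument 2, relation label)
BinarySample = Tuple[str, str, int]    # (argument 1, argument 2, 0 or 1)


def to_binary_tasks(samples: List[Sample]) -> Dict[str, List[BinarySample]]:
    """Build one binary dataset per relation: 1 if the pair has it, else 0."""
    tasks: Dict[str, List[BinarySample]] = {rel: [] for rel in RELATIONS}
    for arg1, arg2, label in samples:
        if label not in tasks:
            raise ValueError(f"unknown relation: {label}")
        for rel in RELATIONS:
            tasks[rel].append((arg1, arg2, int(label == rel)))
    return tasks


# Toy, made-up sentence pairs (not taken from HIT-CDTB).
toy = [
    ("It rained all day.", "The match was cancelled.", "relation_3"),
    ("He likes tea.", "She likes coffee.", "parallel"),
    ("The model is small.", "It has two layers.", "expansion"),
]
binary = to_binary_tasks(toy)
```

Each entry of `binary` is a two-class dataset for one relation, so a single binary classifier can be trained per relation. Whether the paper uses exactly this decomposition is not stated in the abstract.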
Authors
王鸿
舒展
高印权
田文洪
Wang Hong; Shu Zhan; Gao Yinquan; Tian Wenhong (School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China; Yangtze Delta Region Institute of University of Electronic Science and Technology of China, Huzhou 313001, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
PKU Core (Peking University Core Journals)
2021, No. 11, pp. 80-88 (9 pages in total)
Data Analysis and Knowledge Discovery
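Funding
This work is one of the research outcomes of the Key R&D Program of the Ministry of Science and Technology (Project No. 2018AAA0103203).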
Keywords
Single Classifier
Multi-Task Network
Implicit Discourse Relation