Abstract
The idiom cloze test is a subtask of machine reading comprehension (MRC) that tests a model's ability to understand and apply idioms in Chinese text. Existing idiom cloze algorithms overlook the representation collapse that idiom embeddings can suffer, and their models have low accuracy and generalize poorly on out-of-domain data. To address these problems, this paper proposes NeZha-CLofTN, which consists of four parts: an embedding layer, a fusion encoding layer, a graph attention subnetwork, and a prediction layer. The fusion encoding layer uses contrastive learning to force the network to change how it extracts features, which keeps the network from outputting a constant embedding vector and thus prevents representation collapse. The prediction layer combines the outputs of multiple synonym graph subnetworks to obtain better predictions than any single subnetwork and to improve the model's generalization. NeZha-CLofTN was evaluated on the ChID-Official and ChID-Competition datasets, reaching accuracies of 80.3% and 85.3%, respectively, and ablation experiments demonstrate the effectiveness of each module.
Authors
ZHANG Ben-Wen (张本文)
HUANG Fang-Yi (黄方怡)
JU Sheng-Gen (琚生根)
(School of Science and Engineering, Sichuan Minzu College, Kangding 626001, China; College of Computer Science, Sichuan University, Chengdu 610005, China)
Source
Journal of Sichuan University (Natural Science Edition) (《四川大学学报(自然科学版)》)
CAS
CSCD
PKU Core Journals (北大核心)
2022, Issue 5, pp. 54-63 (10 pages)
Funding
National Natural Science Foundation of China, Key Program (62137001).
Keywords
Idiom cloze test
Pre-trained language model
Contrastive learning
Synonym idiom