Abstract
Although neural approaches have yielded state-of-the-art results in the sentence matching task, their performance inevitably drops dramatically when applied to unseen domains. To tackle this cross-domain challenge, we address unsupervised domain adaptation on sentence matching, in which the goal is to achieve good performance on a target domain using only unlabeled target domain data together with labeled source domain data. Specifically, we propose to perform self-supervised tasks to achieve this goal. Different from previous unsupervised domain adaptation methods, self-supervision can not only flexibly suit the characteristics of sentence matching with a special design, but can also be much easier to optimize. During training, each self-supervised task is performed on both domains simultaneously in an easy-to-hard curriculum, which gradually brings the two domains closer together along the direction relevant to the task. As a result, the classifier trained on the source domain is able to generalize to the unlabeled target domain. In total, we present three types of self-supervised tasks, and the results demonstrate their superiority. In addition, we further study the performance of different usages of self-supervised tasks, which would inspire how to effectively utilize self-supervision for cross-domain scenarios.
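To make the training recipe summarized above more concrete, the following is a minimal PyTorch sketch of the general idea, not the authors' implementation: a matching classifier is trained on labeled source pairs, while a self-supervised head is optimized on sentences from both domains, processed in an assumed easy-to-hard order. The toy encoder, the length-based difficulty score, and the placeholder self-supervised labels are all illustrative assumptions; in the paper's setting the self-supervised labels would instead come from one of the three proposed self-supervised tasks.

import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Bag-of-embeddings encoder standing in for a real sentence matcher."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim, padding_idx=0)

    def forward(self, ids):                       # ids: (batch, seq_len)
        return self.emb(ids).mean(dim=1)          # (batch, dim)

def toy_batch(n, max_len=20, vocab=1000):
    """Random padded token ids, standing in for tokenized sentence pairs."""
    ids = torch.randint(1, vocab, (n, max_len))
    for i, length in enumerate(torch.randint(5, max_len + 1, (n,))):
        ids[i, length.item():] = 0                # 0 = padding id
    return ids

encoder = ToyEncoder()
match_head = nn.Linear(64, 2)   # matched / not matched, source labels only
ssl_head = nn.Linear(64, 2)     # toy self-supervised label, both domains

params = list(encoder.parameters()) + list(match_head.parameters()) + list(ssl_head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
ce = nn.CrossEntropyLoss()

# Labeled source pairs and unlabeled target sentences (random stand-ins).
src_ids, src_y = toy_batch(32), torch.randint(0, 2, (32,))
ssl_ids = torch.cat([toy_batch(32), toy_batch(32)])      # source + target
ssl_y = torch.randint(0, 2, (64,))                        # placeholder SSL labels

# Easy-to-hard curriculum: assume shorter inputs are easier, and feed the
# self-supervised mini-batches to the model in that order.
order = torch.argsort((ssl_ids != 0).sum(dim=1))
ssl_ids, ssl_y = ssl_ids[order], ssl_y[order]

for start in range(0, len(ssl_ids), 16):                  # easiest batches first
    sb_ids, sb_y = ssl_ids[start:start + 16], ssl_y[start:start + 16]
    loss = ce(match_head(encoder(src_ids)), src_y)        # supervised, source only
    loss = loss + ce(ssl_head(encoder(sb_ids)), sb_y)     # self-supervised, both domains
    opt.zero_grad()
    loss.backward()
    opt.step()

Because the encoder is shared between the supervised and self-supervised objectives, optimizing the self-supervised task on both domains is what, in the abstract's terms, pulls the two domains closer together and lets the source-trained classifier generalize to the target domain.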
Authors
Gui-Rong Bai, Qing-Bin Liu, Shi-Zhu He, Kang Liu, Jun Zhao (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China)
Funding
Supported by the National Natural Science Foundation of China under Grant Nos. 61922085 and 61976211
the National Key Research and Development Program of China under Grant No. 2020AAA0106400
the Key Research Program of the Chinese Academy of Sciences under Grant No. ZDBS-SSW-JSC006
the Independent Research Project of the National Laboratory of Pattern Recognition under Grant No. Z-2018013
the Youth Innovation Promotion Association of Chinese Academy of Sciences under Grant No. 2020138.