Abstract
This paper provides a quantitative comparison between the text embeddings of a text-only pre-trained model (RoBERTa) and an image-text multi-modal pre-trained model (WenLan). Two quantitative comparison methods are proposed. First, in either embedding space, the semantics of a word are represented by the set of its k nearest neighbor words; the semantic change of the word across the two spaces is then measured by the Jaccard similarity of the two sets. Second, each word is paired with its k nearest neighbors, and the relationships within these word pairs are analyzed. The results show that multi-modal pre-training brings more semantic changes for more abstract words (e.g., success, love); the multi-modal pre-trained model better differentiates antonyms and discovers more hypernyms and hyponyms, while text-only pre-training works better at finding synonyms. Moreover, the multi-modal pre-trained model constructs more extensive associative relationships between words.
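The first comparison method described above can be sketched in a few lines: in each space, take a word's k nearest neighbors (here by cosine similarity, one common choice), then compare the two neighbor sets with Jaccard similarity. The toy vectors and helper names (`knn_words`, `jaccard`) below are illustrative assumptions, not the paper's actual embeddings or code.

```python
import numpy as np

def knn_words(word, embeddings, k):
    """Return the set of k nearest neighbor words of `word` by cosine similarity."""
    unit = {w: v / np.linalg.norm(v) for w, v in embeddings.items()}
    q = unit[word]
    sims = {w: float(q @ v) for w, v in unit.items() if w != word}
    return set(sorted(sims, key=sims.get, reverse=True)[:k])

def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; lower values mean a larger semantic change."""
    return len(a & b) / len(a | b) if a | b else 1.0

# Toy 2-D embeddings standing in for the two spaces (numbers are made up).
space_text = {"love": np.array([1.0, 0.0]), "affection": np.array([0.9, 0.1]),
              "hate": np.array([0.8, 0.2]), "car": np.array([0.0, 1.0])}
space_mm   = {"love": np.array([1.0, 0.0]), "affection": np.array([0.9, 0.1]),
              "hate": np.array([0.1, 0.9]), "car": np.array([0.3, 0.95])}

nn_text = knn_words("love", space_text, k=2)   # {"affection", "hate"}
nn_mm   = knn_words("love", space_mm, k=2)     # {"affection", "car"}
change  = 1.0 - jaccard(nn_text, nn_mm)        # higher = more semantic change
```

In this toy setup the multi-modal space pushes "hate" away from "love", so the neighbor sets overlap less and the measured semantic change is larger.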
Authors
SUN Yuchong, CHENG Xiwei, SONG Ruihua, CHE Wanxiang, LU Zhiwu, WEN Jirong
(Gaoling School of Artificial Intelligence, Renmin University of China, Beijing 100872; School of Statistics, Renmin University of China, Beijing 100872; Beijing Academy of Artificial Intelligence, Beijing 100084; Faculty of Computing, Harbin Institute of Technology, Harbin 150001)
Source
Acta Scientiarum Naturalium Universitatis Pekinensis (Journal of Peking University, Natural Science Edition), 2023, No. 1, pp. 48-56 (9 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding
Supported by the Beijing Outstanding Young Scientist Program (BJJWZYJH012019100020098).
Keywords
multi-modal pre-training
text representation
text embedding analysis