
Difference between Multi-modal vs. Text Pre-trained Models in Embedding Text
Abstract: To analyze in detail the differences between the text embeddings of a text-only pre-trained model (RoBERTa) and an image-text multi-modal pre-trained model (WenLan), two quantitative comparison methods are proposed. First, in either embedding space, the semantics of a word is represented by the set of its k-nearest-neighbor words, and the semantic change of the word between the two spaces is then measured by the Jaccard similarity of the two sets. Second, each word is paired with its k nearest neighbors, and the relationships within the resulting word pairs are analyzed. Experimental results show that image-text multi-modal pre-training brings more semantic change to more abstract words (e.g., success and love), differentiates antonyms better, and discovers more hypernyms and hyponyms, while the text-only pre-trained model works better at finding synonyms. Moreover, the multi-modal pre-trained model constructs broader associative relationships between words.
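The first comparison method described in the abstract can be sketched in a few lines of code. The snippet below is a minimal illustration under stated assumptions, not the authors' released implementation: the embedding matrices emb_text and emb_mm (one word vector per row, same vocabulary order for both models), cosine similarity as the neighbor metric, and k = 10 are all assumptions made for this example.

import numpy as np

def knn_sets(emb, k):
    """For each row (word vector), return the set of indices of its
    k nearest neighbors by cosine similarity, excluding the word itself."""
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = normed @ normed.T
    np.fill_diagonal(sims, -np.inf)          # a word is not its own neighbor
    nn = np.argsort(-sims, axis=1)[:, :k]    # indices of the top-k neighbors
    return [set(row) for row in nn]

def jaccard(a, b):
    """Jaccard similarity of two neighbor sets."""
    return len(a & b) / len(a | b)

def semantic_change(emb_text, emb_mm, k=10):
    """Per-word Jaccard similarity between the two models' k-NN sets.
    emb_text / emb_mm: hypothetical (V, d) embedding matrices extracted
    from RoBERTa and WenLan for the same vocabulary of V words."""
    sets_t = knn_sets(emb_text, k)
    sets_m = knn_sets(emb_mm, k)
    return np.array([jaccard(a, b) for a, b in zip(sets_t, sets_m)])

For each word, a low Jaccard score indicates that its neighborhood, and hence its represented semantics, differs substantially between the two embedding spaces; the abstract reports that such changes concentrate on more abstract words such as success and love.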
Authors: SUN Yuchong, CHENG Xiwei, SONG Ruihua, CHE Wanxiang, LU Zhiwu, WEN Jirong (Gaoling School of Artificial Intelligence, Renmin University of China, Beijing 100872; School of Statistics, Renmin University of China, Beijing 100872; Beijing Academy of Artificial Intelligence, Beijing 100084; Faculty of Computing, Harbin Institute of Technology, Harbin 150001)
Source: Acta Scientiarum Naturalium Universitatis Pekinensis (Journal of Peking University, Natural Science Edition), 2023, No. 1, pp. 48-56 (9 pages). Indexed in EI, CAS, CSCD; PKU Core journal.
Funding: Supported by the Beijing Outstanding Young Scientist Program (BJJWZYJH012019100020098).
Keywords: multi-modal pre-training; text representation; text embedding analysis