Abstract
This paper provides a quantitative comparison between the text embeddings of a text-only pre-trained model (RoBERTa) and an image-text multi-modal pre-trained model (WenLan). Two quantitative comparison methods are proposed. First, in either embedding space, the semantics of a word are represented by the set of its k nearest neighbor words; the semantic change of the word across the two spaces is then measured by the Jaccard similarity of the two sets. Second, each word is paired with its k nearest neighbors, and the relationships within these word pairs are analyzed. The results show that multi-modal pre-training brings more semantic changes for more abstract words (e.g., success, love); the multi-modal pre-trained model better differentiates antonyms and discovers more hypernyms and hyponyms, while text-only pre-training works better at finding synonyms. Moreover, the multi-modal pre-trained model constructs more extensive associative relationships between words.
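The first comparison method described above can be sketched in a few lines: in each space, take a word's k nearest neighbors (here by cosine similarity, one common choice), then compare the two neighbor sets with Jaccard similarity. The toy vectors and helper names (`knn_words`, `jaccard`) below are illustrative assumptions, not the paper's actual embeddings or code.

```python
import numpy as np

def knn_words(word, embeddings, k):
    """Return the set of k nearest neighbor words of `word` by cosine similarity."""
    unit = {w: v / np.linalg.norm(v) for w, v in embeddings.items()}
    q = unit[word]
    sims = {w: float(q @ v) for w, v in unit.items() if w != word}
    return set(sorted(sims, key=sims.get, reverse=True)[:k])

def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; lower values mean a larger semantic change."""
    return len(a & b) / len(a | b) if a | b else 1.0

# Toy 2-D embeddings standing in for the two spaces (numbers are made up).
space_text = {"love": np.array([1.0, 0.0]), "affection": np.array([0.9, 0.1]),
              "hate": np.array([0.8, 0.2]), "car": np.array([0.0, 1.0])}
space_mm   = {"love": np.array([1.0, 0.0]), "affection": np.array([0.9, 0.1]),
              "hate": np.array([0.1, 0.9]), "car": np.array([0.3, 0.95])}

nn_text = knn_words("love", space_text, k=2)   # {"affection", "hate"}
nn_mm   = knn_words("love", space_mm, k=2)     # {"affection", "car"}
change  = 1.0 - jaccard(nn_text, nn_mm)        # higher = more semantic change
```

In this toy setup the multi-modal space pushes "hate" away from "love", so the neighbor sets overlap less and the measured semantic change is larger.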
Authors
SUN Yuchong, CHENG Xiwei, SONG Ruihua, CHE Wanxiang, LU Zhiwu, WEN Jirong
(Gaoling School of Artificial Intelligence, Renmin University of China, Beijing 100872; School of Statistics, Renmin University of China, Beijing 100872; Beijing Academy of Artificial Intelligence, Beijing 100084; Faculty of Computing, Harbin Institute of Technology, Harbin 150001)
Source
Acta Scientiarum Naturalium Universitatis Pekinensis (Journal of Peking University, Natural Science Edition), 2023, No. 1, pp. 48-56 (9 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding
Supported by the Beijing Outstanding Young Scientist Program (BJJWZYJH012019100020098).
Keywords
multi-modal pre-training
text representation
text embedding analysis