
Image-Text Multimodal Gating Enhanced Parallel Corpus Filtering (基于图文多模态门控增强的文本平行句对抽取方法)
Abstract: Most mainstream parallel sentence pair extraction methods fine-tune a pre-trained model and extract parallel sentence pairs based on sentence-level semantic similarity. However, this approach pays too little attention to fine-grained alignment at the word and entity level, so the extracted pairs contain word-level noise, which lowers their quality. Images are a language-independent modality: they can bridge the semantic gap between languages and carry rich word-level or entity-level information. This paper uses the image modality as an anchor for bilingual alignment. Through multimodal gating enhancement, image information is adaptively fused into the representation of each language, and the resulting representations are used to decide whether a sentence pair is parallel. The proposed method requires no prior alignment annotation between images and text. First, image information related to the source-language and target-language sentences is retrieved from a pre-built image database by word-level or entity-level alignment. Second, image-text multimodal gating fuses the image and text information for the source language and the target language separately, yielding image-enhanced text semantic representations. Finally, the two language representations are fused to extract parallel sentence pairs. Experiments on English-Vietnamese and English-German parallel sentence pair extraction tasks demonstrate that fusing image information is effective for text parallel sentence pair extraction.
Author: HUO Xitong (霍茜曈), Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Source: Video Engineering (《电视技术》), 2022, No. 6, pp. 46-53, 57 (9 pages)
Keywords: parallel corpus filtering; image-text multimodal gating; neural network; information enhancement
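
Illustrative note: the abstract describes gated fusion of image information into each language's text representation but gives no formulas. The sketch below shows one common form of such a gate, under stated assumptions: a sigmoid gate computed from the concatenated text and image vectors, applied element-wise to the image vector and added to the text vector. The function names (`gated_fusion`, `cosine`), the gate parameterization, and the use of cosine similarity as a stand-in for the final bilingual fusion step are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_vec, image_vec, gate_weights, gate_bias):
    """Fuse an image vector into a text vector through an element-wise gate.

    Assumed form (not from the paper):
        g = sigmoid(W [text; image] + b)
        fused = text + g * image
    gate_weights holds one row per output dimension; each row spans the
    concatenation of text_vec and image_vec.
    """
    concat = list(text_vec) + list(image_vec)
    fused = []
    for i, (t, v) in enumerate(zip(text_vec, image_vec)):
        pre_activation = sum(w * x for w, x in zip(gate_weights[i], concat))
        gate = sigmoid(pre_activation + gate_bias[i])
        fused.append(t + gate * v)
    return fused


def cosine(a, b):
    """Cosine similarity; a stand-in for the paper's bilingual fusion step."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy vectors: both sentences retrieve the same image, so they share image_vec.
src_text, tgt_text = [1.0, 2.0], [0.9, 2.1]
image_vec = [2.0, 4.0]
weights = [[0.0] * 4, [0.0] * 4]  # zero weights give a gate of exactly 0.5
bias = [0.0, 0.0]

src_fused = gated_fusion(src_text, image_vec, weights, bias)
tgt_fused = gated_fusion(tgt_text, image_vec, weights, bias)
score = cosine(src_fused, tgt_fused)  # higher means more likely parallel
```

In a trained model the gate weights would be learned, so the gate could suppress image information that is irrelevant to a given sentence; the fixed zero weights here only keep the toy example easy to follow.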