Visual Topic Semantic Enhanced Machine Translation for Multi-Modal Data Efficiency

Abstract: The scarcity of bilingual parallel corpora limits the use of state-of-the-art supervised translation technology. One research direction is to exploit relations among multi-modal data to enhance performance. However, the reliance on manually annotated multi-modal datasets results in a high cost of data labeling. In this paper, the topic semantics of images is proposed to alleviate this problem. First, topic-related images can be automatically collected from the Internet by search engines. Second, topic semantics is sufficient to encode the relations between multi-modal data such as texts and images. Specifically, we propose a visual topic semantic enhanced translation (VTSE) model that utilizes topic-related images to construct a cross-lingual and cross-modal semantic space, allowing the VTSE model to simultaneously integrate the syntactic structure and semantic features. In this process, topic-similar texts and images are wrapped into groups so that the model can extract more robust topic semantics from a set of similar images and then further optimize the feature integration. The results show that our model outperforms competitive baselines by a large margin on the Multi30k and the Ambiguous COCO datasets. Our model can use external images to bring gains to translation, improving data efficiency.
Authors: Chao Wang (王超); Si-Jia Cai (蔡思佳); Bei-Xiang Shi (史北祥); Zhi-Hong Chong (崇志宏) (School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; School of Architecture, Southeast University, Nanjing 210096, China)
Source: Journal of Computer Science & Technology (计算机科学技术学报(英文版)), indexed in SCIE, EI, CSCD; 2023, Issue 6, pp. 1223-1236 (14 pages)
Funding: Supported by the National Natural Science Foundation of China under Grant No. 52178034.