
Gaze-Language Cross-Modal Coreference Resolution Method

Object referring with language and human gaze
Abstract: Object referring, also called cross-modal coreference resolution, is the task of locating the target in a natural image according to a person's interaction intention. As one of the key technologies of intelligent human-computer interaction, it can be applied to scenarios such as emergency rescue and disaster relief, home service, and care for the elderly and disabled. Existing object referring methods generally express human intention with single-modal information, such as language or gaze. However, a single input modality conveys only limited interaction information, which makes natural and intelligent human-computer collaboration difficult. To address this problem, this paper fuses gaze and language information to build a cross-modal coreference resolution model that exploits the complementary strengths of multiple modalities to localize, in the image, the target referred to by human intention. Comparative experiments verify that the proposed gaze-language cross-modal fusion method outperforms single-modal input.
Authors: 张珺倩, 宋明武, 谢良, 张亚坤, 印二威, 闫野 (ZHANG Junqian; SONG Mingwu; XIE Liang; ZHANG Yakun; YIN Erwei; YAN Ye)
Affiliations: Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China; National Innovation Institute of Defense Technology, Academy of Military Sciences, Beijing 100071, China; Tianjin Artificial Intelligence Innovation Center, Tianjin 300450, China
Source: 《智能安全》, 2022, No. 1, pp. 89-95 (7 pages)
Keywords: deep learning; multi-modal; localization; gaze; natural language processing
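
The abstract argues that gaze and language are complementary cues for locating a referred object, but it does not describe the model itself. The following is only a toy illustration of that idea, not the authors' method: it assumes candidate boxes are already available, scores each one with a hypothetical keyword-match language score and a hypothetical Gaussian gaze score, and combines the two with a weight `alpha`. All names here (`Candidate`, `gaze_score`, `language_score`, `refer`, `sigma`, `alpha`) are assumptions introduced for the sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    label: str    # object category of the candidate region
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels


def gaze_score(box, gaze_xy, sigma=50.0):
    """Hypothetical gaze cue: Gaussian falloff with the distance
    between the gaze point and the center of the box."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    d2 = (cx - gaze_xy[0]) ** 2 + (cy - gaze_xy[1]) ** 2
    return math.exp(-d2 / (2 * sigma ** 2))


def language_score(label, utterance):
    """Hypothetical language cue: 1.0 if the label occurs as a word
    in the utterance, else 0.0 (a stand-in for a learned text model)."""
    return 1.0 if label in utterance.lower().split() else 0.0


def refer(candidates, utterance, gaze_xy, alpha=0.5):
    """Return the candidate with the highest weighted sum of both cues.
    alpha=1.0 uses language only, alpha=0.0 uses gaze only."""
    def fused(c):
        return (alpha * language_score(c.label, utterance)
                + (1 - alpha) * gaze_score(c.box, gaze_xy))
    return max(candidates, key=fused)


# Two cups and a book; the gaze point lands between the right cup and the book.
candidates = [
    Candidate("cup", (0, 0, 100, 100)),
    Candidate("cup", (400, 0, 500, 100)),
    Candidate("book", (420, 120, 520, 220)),
]
target = refer(candidates, "pass me the cup", gaze_xy=(465, 115))
print(target)
```

In this toy scene the utterance alone cannot tell the two cups apart, and the noisy gaze point alone is closer to the book, while the weighted sum selects the right-hand cup. A real system would replace both scoring functions with learned components.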
