Enhancing answer selection model of grape planting using multimodal knowledge graph
Abstract: Traditional answer selection models match questions and answers using only the information contained in the question-answer pairs themselves. To address this limitation, this study proposes an answer selection model that uses a multimodal knowledge graph to enhance question-answer pairs. The model learns multimodal knowledge graph embeddings with a method based on ComplEx (complex embedding), introduces a context attention mechanism and uses a CNN to obtain feature representations of the multimodal knowledge graph, and applies a knowledge-aware attention method to fuse the background knowledge provided by the multimodal knowledge graph with the textual semantic information of the question-answer pairs. Taking grape planting as an example, experiments were conducted by building a grape planting multimodal knowledge graph and constructing a grape planting question-answer dataset. The results show that using a multimodal knowledge graph helps the model obtain more information and thus perform better: on the grape question-answer dataset, the mean reciprocal rank and mean average precision of the correct answers reached 85.02% and 84.21%, respectively, which are 2.57 and 3.96 percentage points higher than those of the other models. By exploiting the knowledge in the multimodal knowledge graph, the model improves answer selection and can provide a technical basis for downstream tasks such as search and question answering.

Answer selection is one of the most important natural language processing tasks underlying downstream applications such as question-answering systems and search ranking. Given a question, the most relevant answer is selected from a pool of candidate answers, so the task is usually treated as relevance ranking. However, current answer selection models cannot discover the deep semantic relationships between questions and answers from the limited information in the text of the question-answer pairs. A knowledge graph can serve as background knowledge to enhance the deep semantics of an answer selection model. Because existing answer selection models rely solely on the information in the question-answer pairs, support from multimodal background knowledge is still lacking.

In this research, a multimodal knowledge graph enhanced answer selection model was proposed, consisting of an embedding layer, a representation learning layer, a knowledge graph enhancement layer, and an output layer. In the embedding layer, the GloVe model was used to obtain word embeddings for the question-answer texts, and a method based on ComplEx (complex embedding) was designed to learn entity embeddings for the multimodal knowledge graph. Image entity information was taken into account by extracting image feature representations with the Vision Transformer (ViT). In the representation learning layer, bi-directional long short-term memory (Bi-LSTM) was used for representation learning of the question-answer texts, and a context-guided attention mechanism was used to obtain context-guided multimodal knowledge graph vector representations of the question and the answer. In the knowledge graph enhancement layer, an interaction attention mechanism was used to fuse the semantic representation of the question-answer texts with the background knowledge features provided by the multimodal knowledge graph, yielding multimodal knowledge graph enhanced feature representations of the question and the answer. In the output layer, these knowledge graph enhanced feature representations were concatenated with additional semantic features, and the softmax function was used to predict the probability distribution over answer labels for a given question.

Taking grape planting as an example, multimodal entity linking was realized using the longest common subsequence algorithm. Entity recognition for knowledge extraction was implemented using the BERT-LSTM-CRF framework and the BERT pre-trained model. Reference material for the knowledge graph was collected from the literature and from experts. A multimodal knowledge graph for the grape planting field was then constructed. A grape planting question-answer dataset was also constructed using grape forums, smart agricultural platforms, agricultural managers, and agricultural benefit networks as data sources, followed by text cleaning and dataset expansion.

Experimental results show that using the multimodal knowledge graph allowed the model to obtain more information and achieve better performance. Specifically, the mean reciprocal rank and mean average precision reached 85.02% and 84.21%, respectively, on the grape question-answering dataset, increases of 2.57 and 3.96 percentage points, respectively, over the other models. Drawing on knowledge from the multimodal knowledge graph improves the performance of the answer selection model. Embedding representations combined with attention mechanisms can be used to bring in background knowledge from the multimodal knowledge graph. The findings can provide a technical basis for downstream applications of multimodal knowledge graphs, such as search and question answering.
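The two metrics reported in the abstract, mean reciprocal rank (MRR) and mean average precision (MAP), can be illustrated with a minimal sketch. It uses the standard textbook definitions and assumes that each question comes with a list of 0/1 relevance labels for its candidate answers, already sorted by model score in descending order. The function names and the toy `rankings` data are hypothetical and are not taken from the paper's evaluation code.

```python
from typing import List


def reciprocal_rank(labels: List[int]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is correct.

    `labels` holds 1 (correct) or 0 (incorrect) for each candidate,
    ordered from the highest to the lowest model score.
    """
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            return 1.0 / rank
    return 0.0


def average_precision(labels: List[int]) -> float:
    """Average the precision values measured at each correct answer's rank."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_reciprocal_rank(rankings: List[List[int]]) -> float:
    """Mean of the reciprocal ranks over all questions."""
    return sum(reciprocal_rank(r) for r in rankings) / len(rankings)


def mean_average_precision(rankings: List[List[int]]) -> float:
    """Mean of the average precisions over all questions."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Hypothetical toy data: two questions with ranked candidate labels.
rankings = [
    [0, 1, 0, 1],  # first correct answer at rank 2
    [1, 0, 0],     # first correct answer at rank 1
]
print(mean_reciprocal_rank(rankings))    # (1/2 + 1) / 2 = 0.75
print(mean_average_precision(rankings))  # (0.5 + 1.0) / 2 = 0.75
```

Both metrics reward placing correct answers near the top of the ranking. MRR looks only at the first correct answer, while MAP accounts for every correct answer in the candidate pool.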
Authors: YANG Shuo (杨硕); LI Shuqin (李书琴) (College of Information Engineering, Northwest A&F University, Yangling 712100, China)
Source: Transactions of the Chinese Society of Agricultural Engineering (《农业工程学报》; indexed in EI, CAS, CSCD, and the Peking University Core list), 2023, Issue 14, pp. 207-214 (8 pages)
Funding: Fundamental Research Funds for the Central Universities (2452019064).
Keywords: agriculture; knowledge graph; grape cultivation; answer selection; multi-modal; graph representation; natural language processing (NLP)
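The abstract says that entity embeddings are learned with a ComplEx-based method but does not give the details. As a rough illustration only, the sketch below shows the standard ComplEx scoring function, in which a triple (head, relation, tail) is scored as the real part of the sum over dimensions of head × relation × conjugate(tail). How the paper adapts this to image entities is not described in this record, so nothing here should be read as the authors' implementation. The function name and the toy vectors are hypothetical.

```python
from typing import Sequence


def complex_score(
    head: Sequence[complex],
    relation: Sequence[complex],
    tail: Sequence[complex],
) -> float:
    """Standard ComplEx score: Re(sum_k head_k * relation_k * conj(tail_k)).

    A higher score means the triple is considered more plausible.
    """
    if not (len(head) == len(relation) == len(tail)):
        raise ValueError("all embeddings must have the same dimension")
    return sum(
        (h * r * t.conjugate()).real
        for h, r, t in zip(head, relation, tail)
    )


# Hypothetical 2-dimensional complex embeddings (illustrative values only).
head = [1 + 1j, 2 + 0j]
relation = [0 + 1j, 1 + 0j]
tail = [1 - 1j, 0 + 2j]

print(complex_score(head, relation, tail))  # -2.0
print(complex_score(tail, relation, head))  # 2.0
```

Swapping head and tail changes the score, which is why complex-valued embeddings can represent asymmetric relations that a purely real-valued bilinear product with a diagonal relation matrix cannot.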