Research on Multi-Source and Multi-Modal Big Data Retrieval Method Based on MapReduce
(Original title: 基于MapReduce的多源多模态大数据检索方法研究)

Cited by: 10
Abstract: As the volume of network data grows, the number of data sources and data modalities grows with it, so data retrieval faces two major problems: semantic discrimination and massive data volume. To address them, a multi-source, multi-modal big data retrieval method based on MapReduce is proposed and designed. For semantic consistency discrimination, hashing and dictionary learning are introduced to construct an objective evaluation function. To preserve the similarity of feature pairs in the low-dimensional space as far as possible, matrix factorization combined with intermediate variables is used to transform the solution of the objective evaluation function into a non-convex problem. Network learning is also used to capture the relationships among multi-source, multi-modal data: the layer structure is designed in the form of a node adjacency matrix, and the multi-modal retrieval results are obtained from the approximation (similarity) calculation. For massive data, the MapReduce processing framework is introduced to split the retrieval task into lightweight subtasks, and the algorithm is deployed in the map tasks and reduce tasks (maptask and reducetask) to achieve distributed parallel processing. A simulation-based analysis of retrieval quality shows that the proposed method achieves good accuracy and completeness for multi-source, multi-modal data retrieval and significantly improves the efficiency of big data retrieval.
Authors: WEI Xiu-zhuo (魏秀卓); ZHAO Hui-nan (赵慧南) (College of Humanities & Sciences of Northeast Normal University, Changchun, Jilin 130000, China)
Source: Computer Simulation (《计算机仿真》), Peking University Core Journal, 2021, No. 4, pp. 422-426 (5 pages)
Funding: Research on the Application of Big Data in College Student Management (2018001).
Keywords: multi-source and multi-modal; hash algorithm; objective evaluation function; approximation; big data retrieval
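
The division of work described in the abstract (map tasks handle lightweight subtasks, reduce tasks combine their results) can be illustrated with a minimal single-process sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: items are represented by binary hash codes stored as Python integers, Hamming distance stands in for the approximation measure, and the names `map_task`, `reduce_task`, and `retrieve` are hypothetical. The abstract does not specify how the hash codes are learned or how the real maptask and reducetask are implemented.

```python
import heapq


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary hash codes."""
    return bin(a ^ b).count("1")


def map_task(shard, query_code, k):
    """Hypothetical map step: score one shard of (item_id, code) pairs.

    Returns the k pairs (distance, item_id) closest to the query
    within this shard only.
    """
    scored = ((hamming(code, query_code), item_id) for item_id, code in shard)
    return heapq.nsmallest(k, scored)


def reduce_task(partials, k):
    """Hypothetical reduce step: merge per-shard results into a global top-k."""
    merged = (pair for part in partials for pair in part)
    return heapq.nsmallest(k, merged)


def retrieve(shards, query_code, k=3):
    """Run the map step on every shard, then reduce.

    The map calls are independent of one another, which is what would
    let a real framework run them in parallel; here they run in a loop.
    """
    partials = [map_task(shard, query_code, k) for shard in shards]
    return [item_id for _, item_id in reduce_task(partials, k)]


if __name__ == "__main__":
    # Toy data: two shards mixing "image" and "text" items (assumed codes).
    shards = [
        [("img1", 0b1010), ("txt1", 0b1111)],
        [("img2", 0b1011), ("txt2", 0b0000)],
    ]
    print(retrieve(shards, query_code=0b1010, k=2))
```

Each map task returns only its local top-k, so the reduce step merges at most k entries per shard rather than the full data set. This is one common way to keep the subtasks lightweight, not necessarily the scheme used in the paper.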
