Content Sifting Storage Mechanism for Cross-Modal Image and Text Data Based on Semantic Similarity

Abstract: With the explosive growth of multimedia data, data in the cloud has become large-scale and multi-modal. Conventional storage systems that serve data analysis lack semantic management of the data they hold, so an analysis job must read all data out of storage, which leads to very long read latency. Targeting the image and text modalities, this paper proposes a cross-modal image and text content sifting storage (CITCSS) mechanism, built on top of a conventional storage system, that provides a large-scale online similarity-based content sifting service and relieves read-bandwidth pressure at the storage layer by reading only relevant data. The mechanism has an offline stage and an online stage. In the offline stage, the system uses a self-supervised adversarial Hash learning method to map stored data to similarity-preserving Hash codes, which serve as semantic metadata, and places this metadata in an independent metadata space. Because the Hamming distance between such Hash codes measures semantic distance, the system then uses the Neo4j graph database to build a Hash metadata graph in which codes are connected by their Hamming distances, and stores each file's storage path as a node property so that Hash codes map directly to storage paths. In the online stage, the user submits an image or a text that represents the analysis requirement. The storage system first converts it into a Hash code, then searches the Hash metadata graph for similar nodes within the sifting radius, resolves those nodes to the underlying storage paths, and returns the sifted data. In this way the storage system can perceive and manage semantic information and better serve analysis workloads. Experimental results on public cross-modal datasets show that, compared with conventional semantic storage systems, CITCSS reduces read latency by 99.07% to 99.77% while keeping the recall rate above 98%.
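The offline and online stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain Python dictionary with a linear scan stands in for the Neo4j Hash metadata graph, the 8-bit Hash codes are hand-picked rather than produced by the self-supervised adversarial Hash model, and the class name `HashMetadataIndex`, the method names, and the storage paths are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def hamming_distance(a: int, b: int) -> int:
    """Count the differing bits between two Hash codes stored as ints."""
    return bin(a ^ b).count("1")


@dataclass
class HashMetadataIndex:
    """Toy stand-in for the Hash metadata graph (a dict, not Neo4j)."""

    paths_by_code: Dict[int, List[str]] = field(default_factory=dict)

    def add(self, code: int, path: str) -> None:
        # Offline stage: record the Hash code -> storage path mapping.
        self.paths_by_code.setdefault(code, []).append(path)

    def sift(self, query_code: int, radius: int) -> List[str]:
        # Online stage: collect paths whose code lies within the
        # sifting radius, ordered from nearest to farthest.
        hits = []
        for code, paths in self.paths_by_code.items():
            distance = hamming_distance(query_code, code)
            if distance <= radius:
                hits.extend((distance, path) for path in paths)
        return [path for _, path in sorted(hits)]


# Hypothetical codes and paths, chosen only for illustration.
index = HashMetadataIndex()
index.add(0b10110010, "/data/img/cat_001.jpg")
index.add(0b10110011, "/data/txt/cat_note.txt")
index.add(0b01001101, "/data/img/car_042.jpg")

# A query code that differs from the two "cat" codes by 1 and 2 bits.
result = index.sift(0b10110110, radius=2)
print(result)  # ['/data/img/cat_001.jpg', '/data/txt/cat_note.txt']
```

Only the two files whose codes fall inside the radius are returned, so a reader of this index would fetch two paths instead of scanning all three files. The linear scan here costs time proportional to the number of stored codes; the abstract's graph, with codes linked by Hamming distance, is presumably meant to avoid exactly that by walking from a node to its neighbours.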

Authors: Liu Yu (刘渝); Guo Chan (郭婵); Feng Shuyao (冯树耀); Zhou Ke (周可); Xiao Zhili (肖志立) (Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074; Technology and Engineering Group, Tencent Inc., Shenzhen, Guangdong 518054)

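Affiliations: Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology; Technology and Engineering Group, Shenzhen Tencent Computer Systems Co., Ltd.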

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2021, No. 2, pp. 338-355 (18 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

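Funding: National Natural Science Foundation of China, Young Scientists Fund (61902135); National Natural Science Foundation of China, Innovative Research Groups program (61821003).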

Keywords: semantic management; Hash code metadata; metadata graph; storage mechanism; read bandwidth

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

