
Research on Uncertain Data Processing Technology Based on Neo4j Graph Database (cited by: 4)
Abstract: Uncertainty is an essential feature of data, and research on uncertain data is attracting attention in a growing number of fields. After summarizing current methods for handling uncertainty in historical data, and to address the lack of a semantic framework for processing uncertain historical data, we establish a general mathematical model based on the Neo4j graph database. Building on the bi-temporal model, the probabilistic model, and related models, it integrates three aspects of historical data: time, uncertainty, and provenance. A storage system supporting the basic CRUD operations is implemented in Python; it can dynamically add relationships between nodes, store and retrieve historical data, and run filtering queries and fuzzy queries over uncertain data. Experiments comparing how relational and graph databases store the data, and comparing the query efficiency of the storage systems, show that the proposed model is more extensible and the implemented system queries more efficiently, with the advantage becoming more pronounced when storing and retrieving large-scale uncertain data.
Authors: GUO Lin-fei (郭林斐); LIU Guang-zhong (刘广钟) (School of Information Engineering, Shanghai Maritime University, Shanghai 201306, China)
Source: Computer Technology and Development (《计算机技术与发展》), 2020, No. 1, pp. 25-31 (7 pages)
Funding: National Natural Science Foundation of China (61202370); Scientific Research Innovation Project of the Shanghai Municipal Education Commission (14YZ110); China Postdoctoral Science Foundation (2014M561512)
Keywords: digital humanities; uncertainty; property graphs; Neo4j; bi-temporal model
