Abstract
Uncertainty is an inherent feature of data, and research on uncertain data is attracting attention in a growing number of fields. After surveying current methods for handling uncertainty in historical data, this paper addresses the lack of a semantic framework for processing uncertain historical data by establishing a general mathematical model on top of the Neo4j graph database. Building on the bi-temporal model and the probabilistic model, among others, the model integrates three aspects of historical data: time, uncertainty, and provenance. A storage system supporting basic CRUD operations is implemented in Python; it can dynamically add relationships between nodes, store and retrieve historical data, and run filtering queries and fuzzy queries over uncertain data. Experiments comparing how relational and graph databases store the data, and comparing the query efficiency of the storage systems, show that the proposed model is more extensible and the implemented system answers queries more efficiently, with the advantage becoming more pronounced when storing and retrieving large-scale uncertain data.
Authors
GUO Lin-fei (郭林斐)
LIU Guang-zhong (刘广钟)
School of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
Source
Computer Technology and Development (《计算机技术与发展》)
2020, No. 1, pp. 25-31 (7 pages)
Funding
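National Natural Science Foundation of China (61202370)
Innovation Program of Shanghai Municipal Education Commission (14YZ110)
China Postdoctoral Science Foundation (2014M561512)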