Abstract
Traditional relational databases struggle to store and retrieve very large volumes of data quickly. To address this problem, this paper studies a set of mapping tables (a data file field mapping table, a mapping table between file object fields and HBase columns, and a mapping table of storage conversion execution plans) that resolves the heterogeneity of file objects and makes the storage conversion general-purpose. It proposes rules and a generation algorithm for a user-defined RowKey, and gives the workflow and algorithm for data conversion and storage based on the mapping tables and the RowKey: data in a data file is converted into HBase according to the mapping rules between data file fields and HBase table columns. Finally, fast data access and retrieval for different requirements are implemented through RowKey prefix matching or keyword matching. The approach is highly general and can be widely applied to HBase-based big data storage.
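As a rough illustration of the ideas summarized above, the following minimal Python sketch maps file fields to HBase-style columns, builds a row key from selected fields, and retrieves rows by row key prefix or keyword. It is not the paper's algorithm: the mapping table FIELD_TO_COLUMN, the field names, the "|" separator, the key rule in make_row_key, and the in-memory SortedStore (a stand-in for an HBase table, which likewise keeps rows sorted by row key) are all assumptions made for illustration.

```python
import bisect

# Hypothetical mapping table: file field -> (column family, qualifier).
FIELD_TO_COLUMN = {
    "station": ("info", "station"),
    "timestamp": ("info", "ts"),
    "value": ("data", "value"),
}


def make_row_key(record, key_fields=("station", "timestamp")):
    """Join the chosen fields with '|' (an illustrative rule, not the paper's)."""
    return "|".join(str(record[f]) for f in key_fields)


def to_row(record):
    """Convert a parsed file record to (row_key, {'family:qualifier': value})."""
    cells = {
        f"{family}:{qualifier}": str(record[field])
        for field, (family, qualifier) in FIELD_TO_COLUMN.items()
        if field in record
    }
    return make_row_key(record), cells


class SortedStore:
    """In-memory stand-in for an HBase table: rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys in lexicographic order
        self._rows = {}   # row key -> cells

    def put(self, row_key, cells):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
        self._rows[row_key] = cells

    def scan_prefix(self, prefix):
        """Return rows whose key starts with prefix, using the sort order."""
        start = bisect.bisect_left(self._keys, prefix)
        hits = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # matching keys are contiguous, so stop at the first miss
            hits.append((key, self._rows[key]))
        return hits

    def scan_keyword(self, keyword):
        """Return rows whose key contains keyword (a full scan over all keys)."""
        return [(k, self._rows[k]) for k in self._keys if keyword in k]
```

In this sketch, a prefix lookup such as scan_prefix("S01|") only touches the contiguous range of matching keys, whereas scan_keyword has to inspect every key. That difference is the usual reason for placing the most frequently queried field first in a row key.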
Authors
Sheng Wenshun (Pujiang Institute, Nanjing Tech University, Nanjing 211200, China)
Xu Aiping (School of Computer, Wuhan University, Wuhan 430072, China)
Source
Application Research of Computers (《计算机应用研究》)
CSCD
Peking University Core Journal
2019, No. 12, pp. 3806-3810 (5 pages)
Funding
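National Key R&D Program of China, Key Special Project (2017YFC0803700)
Natural Science Research General Program of Jiangsu Higher Education Institutions (19KJD520005)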
Keywords
big data
file storage
row key
eigenvalue
rapid retrieval