Abstract
Recovering data lost by users of a distributed database improves both the integrity of the database and the security of user information. Reconstructing the lost data requires computing weighting coefficients for the nearest neighbors and, from them, the fill (padding) values for the missing entries. Traditional methods build a sampling matrix of the missing user-data items and use it as the measurement matrix in a compressed sensing framework, but they neglect the computation of fill values for the lost data, so recovery results are unsatisfactory. This paper proposes using a genetic optimization method to estimate the parameters of the users' missing data and obtain the optimal parameters. On the basis of these optimal parameters, the Mahalanobis distance between genes is used to select the nearest-neighbor genes. The estimated missing-data parameters are then applied to the subsequent restoration and reconstruction of the lost data, and an entropy-based approach is used to compute the weighting coefficients of the nearest neighbors, yielding the fill values for the users' lost data in the distributed database. Experimental results show that, for data under different missingness patterns, the method achieves higher recovery accuracy than the other data recovery methods compared, and that reconstruction performance improves further on larger data sets.
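The neighbor-selection and weighting steps described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the function names (`invert`, `mahalanobis`, `impute`) are hypothetical, the neighbor count `k` is fixed by hand rather than estimated by genetic optimization, and the weight of each neighbor is taken as the surprisal (negative log) of its share of the total distance, a simple information-theoretic stand-in for the entropy-based coefficient.

```python
import math

def covariance(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n = len(rows)
    mu = [sum(col) / n for col in zip(*rows)]
    d = len(mu)
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]

def invert(m, ridge=1e-9):
    """Gauss-Jordan inverse with partial pivoting.
    A small ridge is added to the diagonal (assumption) for numerical stability."""
    d = len(m)
    a = [[m[i][j] + (ridge if i == j else 0.0) for j in range(d)]
         + [1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(d):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[d:] for row in a]

def mahalanobis(x, y, inv_cov):
    """sqrt((x - y)^T S^-1 (x - y)) for a given inverse covariance S^-1."""
    diff = [a - b for a, b in zip(x, y)]
    tmp = [sum(inv_cov[i][j] * diff[j] for j in range(len(diff)))
           for i in range(len(diff))]
    return math.sqrt(max(0.0, sum(d * t for d, t in zip(diff, tmp))))

def impute(record, complete_rows, k=3, eps=1e-12):
    """Fill the None entries of `record` from its k nearest complete rows.
    Distances use only the attributes observed in `record`."""
    obs = [i for i, v in enumerate(record) if v is not None]
    mis = [i for i, v in enumerate(record) if v is None]
    sub = [[r[i] for i in obs] for r in complete_rows]
    inv_cov = invert(covariance(sub))
    target = [record[i] for i in obs]
    ranked = sorted((mahalanobis(target, s, inv_cov), idx)
                    for idx, s in enumerate(sub))[:k]
    # Hypothetical weighting rule: surprisal of each neighbor's distance share,
    # so a neighbor with a smaller share of the total distance weighs more.
    total = sum(d for d, _ in ranked) + eps * len(ranked)
    raw = [-math.log((d + eps) / total) for d, _ in ranked]
    s = sum(raw)
    weights = ([w / s for w in raw] if s > 0
               else [1.0 / len(ranked)] * len(ranked))
    filled = list(record)
    for m in mis:
        filled[m] = sum(w * complete_rows[idx][m]
                        for w, (_, idx) in zip(weights, ranked))
    return filled, weights

# Usage with made-up data: the third attribute of `record` is missing.
complete = [[1.0, 2.0, 10.0], [2.0, 1.0, 20.0], [3.0, 4.0, 30.0],
            [4.0, 3.0, 40.0], [5.0, 6.0, 50.0], [6.0, 5.0, 60.0]]
record = [2.1, 1.2, None]
filled, weights = impute(record, complete, k=3)
```

In a fuller version, a genetic search could tune quantities such as `k` against held-out entries; that step is omitted here because the abstract does not specify which parameters are optimized.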
Authors
何丹丹
王立娟
HE Dan-dan; WANG Li-juan (Dalian Institute of Science and Technology, Dalian, Liaoning 116052, China)
Source
《计算机仿真》
Peking University Core Journal
2018, Issue 6, pp. 375-379 (5 pages)
Computer Simulation
Funding
Liaoning Provincial Private Education Association, 2017 approved research project (LMJK2017075)
Keywords
Database user
Missing data
Restoration and reconstruction