Abstract
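Given that data sources in cloud storage are dispersed and hard to centralize, and based on the relationship between the number of classification rules extracted by the agents and both each agent's extraction error rate and the overall extraction error rate, this paper proposes a genetic-algorithm-based method for extracting classification rules in cloud storage. Classification rules are extracted in a distributed manner at the agents and then transmitted to a central database for merging, thereby achieving distributed classification rule extraction. Theoretical derivation shows that the upper bounds of each agent's extraction error rate and of the overall extraction error rate decrease as the number of extracted rules increases. Experimental results show that, when sufficiently many rules are extracted, the difference between the regression accuracy of distributed extraction and that of centralized extraction tends to a constant, which supports the feasibility of distributed classification rule extraction in cloud storage.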
To address the decentralized nature of data sources in cloud storage, this paper examines the relationship between the number of extracted classification rules and the error rates of each agent and of the whole system. Rules are extracted at the distributed agents, and the resulting rule sets are merged in a central rule database. For this setting, the paper proposes a criterion stating that the error rate of each agent and the upper bound on the error rate of the whole system decrease as the number of extracted classification rules increases. The correctness of the proposed criterion is established through formal proof and theoretical derivation. Experiments verify the theoretical derivation and also show that the difference between the regression classification accuracy of the distributed extraction method and that of the centralized extraction method approaches a constant, which demonstrates the feasibility of the distributed extraction method.
Source
《计算机工程》
CAS
CSCD
2013, No. 7, pp. 45-50 (6 pages)
Computer Engineering
Funding
Supported by the Natural Science Foundation of Shanghai (10ZR1410400)
Keywords
Genetic Algorithm (GA)
cloud storage
rule-based classifier
classification rule extraction
agent rule merging
error rate