Abstract
To improve the efficiency of data mining, a merge computation method based on a unit hierarchy tree is proposed. Taking equipment maintenance support data as an example, a model is established that merges data bottom-up, layer by layer, starting from the leaf nodes. The MapReduce parallel computing model is improved, and the Hadoop Map/Reduce distributed processing architecture is deployed in fully distributed mode to perform attribute reduction and data merging over the unit hierarchy tree. The speedup and scalability of the MapReduce distributed model are then analyzed. The results show that the method achieves good results and has some theoretical value.
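The bottom-up, layer-by-layer merge over a unit hierarchy tree can be illustrated with a minimal single-process sketch. It is not the authors' implementation and does not use Hadoop: the unit names, the `parent` mapping, the numeric record values, and the functions `depth` and `rollup` are all hypothetical, and plain summation stands in for whatever merge operation the paper applies. Each level performs a map step (every unit emits its subtotal keyed by its parent) followed by a reduce step (subtotals are summed per parent).

```python
from collections import defaultdict


def depth(unit, parent):
    """Number of edges from `unit` up to the root of the hierarchy."""
    d = 0
    while parent.get(unit) is not None:
        unit = parent[unit]
        d += 1
    return d


def rollup(parent, records):
    """Merge record values bottom-up through a unit hierarchy.

    parent  : dict mapping each unit to its parent unit (the root maps to None)
    records : iterable of (unit, value) pairs attached to units
    Returns a dict mapping every unit to the total of its whole subtree.
    """
    totals = defaultdict(int)
    for unit in parent:
        totals[unit] += 0  # make sure every unit appears in the result

    # Initial reduce: combine the raw records by unit.
    for unit, value in records:
        totals[unit] += value

    # Group units by depth so that the deepest layer is merged first.
    levels = defaultdict(list)
    for unit in parent:
        levels[depth(unit, parent)].append(unit)

    for d in sorted(levels, reverse=True):
        if d == 0:
            continue  # the root has no parent to merge into
        # Map: each unit emits (parent, subtotal).
        emitted = [(parent[u], totals[u]) for u in levels[d]]
        # Reduce: sum the emitted subtotals per parent key.
        for p, v in emitted:
            totals[p] += v

    return dict(totals)


# Hypothetical three-level hierarchy with maintenance counts on the leaves.
example_parent = {
    "brigade": None,
    "battalion_1": "brigade",
    "battalion_2": "brigade",
    "company_a": "battalion_1",
    "company_b": "battalion_1",
    "company_c": "battalion_2",
}
example_records = [("company_a", 3), ("company_b", 5),
                   ("company_c", 2), ("company_a", 1)]
example_totals = rollup(example_parent, example_records)
```

Because units at the same depth never depend on one another, the per-level map and reduce steps are the part that a distributed MapReduce job could run in parallel.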
Authors
Luo Xiaoling (罗晓玲)
Chen Caisen (陈财森)
Xiang Yangxia (向阳霞)
Jin Chuanyang (金传洋)
Department of Information & Communication, Army Academy of Armored Forces, Beijing 100072, China
Source
Ordnance Industry Automation (《兵工自动化》)
2021, No. 12, pp. 1-4 (4 pages)
Keywords
data preprocessing (数据预处理)
merging algorithm (归并计算)
distributed computing (分布式计算)