Simulation of Dual Mining of Multi-Source Heterogeneous Data Based on Pruning Processing
Abstract Multi-source heterogeneous data can come from different fields, in different formats, and from sources of varying quality, which makes it difficult to process. To address the difficulty of mining such data accurately, this paper proposes a multi-source heterogeneous data mining algorithm based on decision tree classification. A decision tree is first constructed to partition the data attributes, and the initial tree is then pruned to obtain the attribute set of the multi-source heterogeneous data, from which the data factors are extracted to give a coarse mining result. A deep learning algorithm is then used to mine the multi-source heterogeneous data remaining in the rest of the data, and a second mining pass is carried out on the original multi-source heterogeneous dataset. Finally, the coarse and fine mining results are integrated to complete the mining of multi-source heterogeneous data. Experimental results show that the proposed algorithm achieves a high F1 value, a low generalization error, and strong data mining performance.
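The coarse stage described in the abstract (build a decision tree, prune it, read off the surviving attribute set) can be illustrated with a small sketch. The abstract does not say which splitting criterion or pruning rule the paper uses, so this example assumes ID3-style information gain and reduced-error pruning against a validation set. The attribute names (`quality`, `source`), the labels (`keep`, `drop`), and the toy records are all hypothetical and are not taken from the paper. The deep learning stage is not covered.

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def majority(rows):
    return Counter(label for _, label in rows).most_common(1)[0][0]

def build_tree(rows, attrs):
    """rows: list of (attribute_dict, label). Returns a leaf label or a node dict."""
    labels = [label for _, label in rows]
    if len(set(labels)) == 1 or not attrs:
        return majority(rows)

    def gain(attr):
        parts = {}
        for feats, label in rows:
            parts.setdefault(feats[attr], []).append(label)
        remainder = sum(len(p) / len(rows) * entropy(p) for p in parts.values())
        return entropy(labels) - remainder

    best = max(attrs, key=gain)  # attribute with the highest information gain
    groups = {}
    for feats, label in rows:
        groups.setdefault(feats[best], []).append((feats, label))
    rest = [a for a in attrs if a != best]
    return {"attr": best, "default": majority(rows),
            "children": {v: build_tree(sub, rest) for v, sub in groups.items()}}

def predict(tree, feats):
    while isinstance(tree, dict):
        tree = tree["children"].get(feats[tree["attr"]], tree["default"])
    return tree

def errors(tree, rows):
    return sum(predict(tree, feats) != label for feats, label in rows)

def prune(tree, val_rows):
    """Reduced-error pruning (assumed rule): collapse a subtree into its
    majority label when that does not increase validation errors.
    Modifies the tree in place and returns the (possibly new) root."""
    if not isinstance(tree, dict):
        return tree
    for value, child in list(tree["children"].items()):
        subset = [(f, l) for f, l in val_rows if f[tree["attr"]] == value]
        tree["children"][value] = prune(child, subset)
    if errors(tree["default"], val_rows) <= errors(tree, val_rows):
        return tree["default"]
    return tree

def used_attributes(tree):
    """Attributes still tested somewhere in the tree."""
    if not isinstance(tree, dict):
        return set()
    found = {tree["attr"]}
    for child in tree["children"].values():
        found |= used_attributes(child)
    return found

def row(quality, source, label):
    return ({"quality": quality, "source": source}, label)

# Hypothetical training records; the last two are deliberately mislabelled.
train = [row("high", "sensor", "keep"), row("high", "log", "keep"),
         row("high", "sensor", "keep"), row("low", "sensor", "drop"),
         row("low", "sensor", "drop"), row("low", "log", "drop"),
         row("low", "log", "keep"), row("low", "log", "keep")]
# Hypothetical clean validation records used only for pruning.
validation = [row("high", "sensor", "keep"), row("low", "sensor", "drop"),
              row("low", "log", "drop"), row("low", "log", "drop"),
              row("high", "log", "keep")]

full_tree = build_tree(train, ["quality", "source"])
full_attrs = used_attributes(full_tree)      # read before pruning mutates the tree
pruned_tree = prune(full_tree, validation)
pruned_attrs = used_attributes(pruned_tree)

print(sorted(full_attrs))    # ['quality', 'source']
print(sorted(pruned_attrs))  # ['quality']
```

In this toy run the unpruned tree splits on `source` only to fit the two mislabelled records. Pruning removes that split, so the attribute set passed on to later stages shrinks to `quality` alone.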
Authors LIU Shi-jin; YANG Zhi-ling (Zhujiang College of South China Agricultural University, Guangzhou, Guangdong 510600, China; Wuhan University, Wuhan, Hubei 430072, China)
Source Computer Simulation (《计算机仿真》), 2024, No. 8, pp. 513-516, 534 (5 pages)
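Funding Undergraduate University Teaching Quality and Teaching Reform Project of the Department of Education of Guangdong Province (粤教高函【2023】4号-1084).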
Keywords Decision tree; Data classification; Multi-source heterogeneous data; Data mining; Deep learning algorithm