Abstract
This paper studies a human resource recommendation method based on the decision tree algorithm, with the aim of improving the overall quality and practical effectiveness of human resource recommendation. Human resource data are collected from massive data sources by streaming distributed data collection, classified by data source category, and stored as the original human resource data set. To address the defects of this data set, the data are preprocessed through extraction, cleaning and transformation, and loading, and a data warehouse is constructed. The data in the warehouse are then fed into the C4.5 decision tree algorithm, which is an improvement on the ID3 decision tree algorithm. The algorithm generates a decision tree by repeatedly splitting nodes across multiple layers until it obtains same-category split results that satisfy the preset termination conditions for human resource recommendation, thereby producing the recommendation. The results show that the optimal number of leaf nodes for the proposed method is 15. At this number of leaf nodes, the recall, F1 score, and coverage of the method are all high, and the method can significantly improve the overall quality and effectiveness of human resource recommendation.
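The node-splitting step mentioned in the abstract can be illustrated with a minimal sketch of the standard C4.5 gain-ratio criterion. The abstract does not give the paper's attributes, termination conditions, or implementation, so everything below is an assumption for illustration: the field names (`skill`, `degree`), the class labels, and the stopping rules (pure node, no attributes left, or no positive gain ratio) are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, attr):
    """C4.5 gain ratio for splitting on the categorical attribute `attr`."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted entropy of the child nodes after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = -sum(len(g) / total * math.log2(len(g) / total)
                      for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0


def build_tree(rows, labels, attrs):
    """Recursively split until an (assumed) termination condition holds."""
    majority = Counter(labels).most_common(1)[0][0]
    # Assumed stopping rules: node is pure, or no attributes remain.
    if len(set(labels)) == 1 or not attrs:
        return majority
    best = max(attrs, key=lambda a: gain_ratio(rows, labels, a))
    if gain_ratio(rows, labels, best) <= 0:
        return majority
    remaining = [a for a in attrs if a != best]
    children = {}
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        children[value] = build_tree([r for r, _ in pairs],
                                     [l for _, l in pairs],
                                     remaining)
    return (best, children)


# Hypothetical toy records; not data from the paper.
rows = [
    {"skill": "java", "degree": "bachelor"},
    {"skill": "java", "degree": "master"},
    {"skill": "sales", "degree": "bachelor"},
    {"skill": "sales", "degree": "master"},
]
labels = ["developer", "developer", "account_manager", "account_manager"]
tree = build_tree(rows, labels, ["skill", "degree"])
```

In this toy set, `skill` separates the two classes completely while `degree` carries no information, so the sketch splits once on `skill` and stops at two pure leaves. A cap on the number of leaf nodes, such as the value of 15 reported in the abstract, would be an additional stopping rule that this sketch does not implement.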
Authors
王联英
慈玉鹏
WANG Lianying; CI Yupeng (Qinghai Radio and TV University, Xining 810008, China; Xiang’an Campus of Xiamen University, Xiamen 361102, China)
Source
《现代电子技术》
2021, Issue 3, pp. 105-110 (6 pages)
Modern Electronics Technique
Keywords
decision tree algorithm
human resource recommendation
data acquisition
data preprocessing
data warehouse construction
decision tree generation