Abstract
To address the long running time and low performance of automatically generating machine learning pipelines from datasets in data analysis, this paper proposes DFSR (data features and service associations), a method that combines data features with service associations to generate a machine learning pipeline automatically for a given dataset. Experimental results show that the generated pipelines reach a relatively good level of performance among current AutoML approaches, that generation time is reduced to the order of minutes, and that the overall results are more balanced.
Authors
ZHAO Rutao (赵汝涛)
WANG Jing (王菁)
Beijing Key Laboratory on Integration and Analysis of Large-Scale Stream Data, North China University of Technology, Beijing 100144
Source
Computer & Digital Engineering (《计算机与数字工程》)
2020, No. 12, pp. 2875-2880 (6 pages)
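Funding
Supported by the Key Program of the National Natural Science Foundation of China, "Research on Big Service Theory and Methods in the Big Data Environment" (Grant No. 61832004).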