摘要
要实现高效的大数据机器学习,需要构建一个能同时支持机器学习算法设计和大规模数据处理的一体化大数据机器学习系统。研究设计高效、可扩展且易于使用的大数据机器学习系统面临诸多技术挑战。近年来,大数据浪潮的兴起,推动了大数据机器学习的迅猛发展,使大数据机器学习系统成为大数据领域的一个热点研究问题。介绍了国内外大数据机器学习系统的基本概念、基本研究问题、技术特征、系统分类以及典型系统;在此基础上,进一步介绍了本实验室研究设计的一个跨平台统一大数据机器学习系统——Octopus(大章鱼)。
To achieve efficient big data machine learning, we need to construct a unified big data machine learning system to support both machine learning algorithm design and big data processing. Designing an efficient, scalable and easy-to-use big data machine learning system still faces a number of challenges. Recently, the upsurge of big data technology has promoted rapid development of big data machine learning, making big data machine learning system to become a research hotspot. The basic concepts, research issues, technical characteristics, categories, and typical systems for big data machine learning system, were reviewed. Then a unified and cross-platform big data machine learning system, Octopus, was presented.
出处
《大数据》
2015年第1期28-47,共20页
Big Data Research
基金
江苏省科技支撑计划基金资助项目(No.BE2014131)~~
关键词
大数据
机器学习
分布并行计算
大数据处理平台
big data, machine learning, distributed and parallel computing, big data processing platform