
Survey on Machine Learning for Database Systems (cited by: 31)
Abstract: After roughly 50 years of development, database systems are mature and widely deployed commercially, yet continued work on traditional optimization tasks now yields only small performance gains. With the arrival of the big data era, database systems face challenges on two fronts. First, continually growing data volumes demand faster processing of each individual query. Second, query workloads change quickly and are highly diverse, so database configurations and query optimization preferences based on DBA experience cannot be adjusted in real time to the best runtime state. Performance optimization has reached a bottleneck: the room for improvement is narrowing, further gains depend on new hardware accelerators, and traditional database systems cannot exploit modern accelerators effectively. In addition, a database system exposes hundreds of tunable parameters, and under frequently changing workloads the volume of tedious configuration work exceeds what DBAs can handle, leaving the system unable to respond in real time to rapid and varied change. Machine learning meets both needs: it runs well on modern accelerators, and it can learn from accumulated parameter tuning experience. Machine-learned database systems bring machine learning techniques into database system design. On the one hand, they replace sequential scans with computational models that can run on modern hardware acceleration platforms; on the other hand, they turn DBA experience into predictive models, so that the database system adapts more intelligently and dynamically to fast-changing, diverse workloads. This paper summarizes current research on machine learning for database systems, covering machine learning for storage management and query optimization as well as automatic database management systems. Based on an analysis of existing techniques, it points out future research directions for machine-learned database systems and the problems and challenges they may face.
Authors: Meng Xiaofeng; Ma Chaohong; Yang Chen (School of Information, Renmin University of China, Beijing 100872)
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list, 2019, No. 9, pp. 1803-1820 (18 pages)
Funding: National Natural Science Foundation of China (61532016, 61532010, 91846204, 91646203, 61762082); National Key Research and Development Program of China (2016YFB1000602, 2016YFB1000603)
Keywords: database systems; machine learning; learned index; automatic database systems
