Abstract
[Objective] Data science can already process large volumes of data and has solved many problems, and it is changing how scientific research, enterprise operations, and social governance are carried out. Its results, however, are difficult to engineer. Effectively converting data assets and their latent value into services, decisions, and products that form a digital economy requires a data engineering discipline that supports engineering activities on data and realizes data-driven value conversion in everyday work. [Methods] This paper introduces engineering thinking and extends narrow data engineering, which emerged alongside data science, into generalized data engineering. It argues the necessity of establishing a data engineering discipline. With reference to the development of the civil engineering discipline and the characteristics an engineering discipline should have, it analyzes the knowledge characteristics of data engineering, which rest on data as a material basis. It then presents the concept, theoretical basis, research content, research framework, and main technical system of the discipline, and uses two data engineering application cases to illustrate the need for data engineering as a new methodology. [Conclusions] The data engineering discipline has a distinct body of knowledge grounded in data as its material basis, and distinctive research methods that integrate mathematics, electronics and information science, computer science, data science, and various domain disciplines. The material, theoretical, technical, and demand foundations for building the discipline are in place, and establishing it to support the conversion of data assets into engineering applications and a digital economy is urgent.
Author
张耀南
ZHANG Yaonan (National Cryosphere Desert Scientific Data Center, Lanzhou, Gansu 730000, China; Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou, Gansu 730000, China; Gansu Data Engineering and Technology Research Center for Resource and Environment, Lanzhou, Gansu 730000, China)
Source
《数据与计算发展前沿》
CSCD
2022, Issue 1, pp. 5-19 (15 pages)
Frontiers of Data & Computing
Funding
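Chinese Academy of Sciences Informatization Program, "Construction and Application of a Science and Technology Domain Cloud for Cold and Arid Region Environment Research" (XXH13506).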
Keywords
narrow data engineering
generalized data engineering
data engineering discipline
data science
digital economy