Abstract
To support data mining and knowledge discovery across multidisciplinary domains, big data analysis must balance domain complexity, ease of use, and execution efficiency. This paper proposes a domain-business-driven approach to modeling big data analysis processes, which guides the construction and implementation of process models. The analysis process is divided into a two-layer model, one layer domain-oriented and one platform-oriented, and the transformation between the two is performed automatically by a model-driven mapping method. The domain-oriented analysis model is defined from the perspective of domain business, so that users building a process can focus on the analysis logic itself without concern for the details of specific algorithm computation or process execution. The platform-oriented analysis model is defined from the perspective of computation and execution, so that during execution the process makes full use of the platform's computing, storage, and algorithm resources, improving both process execution efficiency and algorithm computation speed. Finally, taking Hadoop as the underlying execution platform, the paper illustrates how a domain-oriented big data analysis process model is transformed.
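The two-layer idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the model schema or the mapping rules, so every name here (`DomainStep`, `PlatformStep`, `MAPPING_RULES`, `to_platform_model`) and the task-to-algorithm pairs are hypothetical, chosen only to show a domain-level step being mapped automatically to a platform-level step.

```python
from dataclasses import dataclass

# All names and rules below are hypothetical illustrations;
# the abstract does not specify the models' actual structure.


@dataclass(frozen=True)
class DomainStep:
    """A step in the domain-oriented model: business intent only."""
    name: str
    task: str       # business-level task, e.g. "cluster" or "classify"
    inputs: tuple   # names of the datasets the step consumes


@dataclass(frozen=True)
class PlatformStep:
    """A step in the platform-oriented model: concrete algorithm and engine."""
    name: str
    algorithm: str
    engine: str
    inputs: tuple


# Assumed mapping rules: business-level task -> (algorithm, execution engine).
MAPPING_RULES = {
    "cluster": ("k-means", "mapreduce"),
    "classify": ("naive-bayes", "mapreduce"),
}


def to_platform_model(domain_steps):
    """Map each domain-level step to a platform-level step using the rules."""
    platform_steps = []
    for step in domain_steps:
        if step.task not in MAPPING_RULES:
            raise ValueError(f"no mapping rule for task {step.task!r}")
        algorithm, engine = MAPPING_RULES[step.task]
        platform_steps.append(
            PlatformStep(step.name, algorithm, engine, step.inputs)
        )
    return platform_steps
```

In this sketch the user writes only `DomainStep` objects. The choice of algorithm and engine lives in the mapping rules, which is one simple way to keep the analysis logic separate from execution details.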
Authors
WEN Bilong (文必龙)
LI Yanchun (李艳春)
School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318
Source
Computer & Digital Engineering (《计算机与数字工程》)
2022, Issue 4, pp. 865-870, 906 (7 pages in total)
Funding
Supported by the National Natural Science Foundation of China, General Program (No. 41574117)
and the National Major Project (No. 2016ZX05033-005-004).