Journal Articles
248 articles found
Using WorkFlows to Print Book Labels and Branch-Library Holdings Ledgers
1
Authors: 华芳, 刘畅, 尹进. 《农业图书情报学刊》, 2006, Issue 8, pp. 174-176 (3 pages)
This article details the steps by which a library can use the WorkFlows program from the US company Sirsi to print book labels and branch-library holdings ledgers. Each step covers setting properties, generating reports, converting reports, adjusting formats, and points to note; in particular, it gives a method for configuring the print format to suit practical work needs, which has real practical value. The article also offers several suggestions for this printing work.
Keywords: WorkFlows, book labels, holdings ledger
Efficient Computation Offloading of IoT-Based Workflows Using Discrete Teaching Learning-Based Optimization
2
Authors: Mohamed K. Hussein, Mohamed H. Mousa. Computers, Materials & Continua (SCIE, EI), 2022, Issue 11, pp. 3685-3703 (19 pages)
As the Internet of Things (IoT) and mobile devices have rapidly proliferated, their computationally intensive applications have developed into complex, concurrent IoT-based workflows involving multiple interdependent tasks. By exploiting its low latency and high bandwidth, mobile edge computing (MEC) has emerged to achieve the high-performance computation offloading of these applications to satisfy the quality-of-service requirements of workflows and devices. In this study, we propose an offloading strategy for IoT-based workflows in a high-performance MEC environment. The proposed task-based offloading strategy consists of an optimization problem that includes task dependency, communication costs, workflow constraints, device energy consumption, and the heterogeneous characteristics of the edge environment. In addition, the optimal placement of workflow tasks is optimized using a discrete teaching learning-based optimization (DTLBO) metaheuristic. Extensive experimental evaluations demonstrate that the proposed offloading strategy is effective at minimizing the energy consumption of mobile devices and reducing the execution times of workflows compared to offloading strategies using different metaheuristics, including particle swarm optimization and ant colony optimization.
Keywords: high-performance computing, Internet of Things (IoT), mobile edge computing (MEC), workflows, computation offloading, teaching learning-based optimization
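The DTLBO abstract above describes placing workflow tasks on edge nodes with a population-based teacher/learner metaheuristic. As a hedged illustration only (this is not the authors' algorithm; the crossover-style move, the population size, and the cost function are all invented for the sketch), a discrete TLBO loop can be written in a few lines of Python:

```python
import random

def dtlbo(num_tasks, num_nodes, cost, pop_size=20, iters=100, seed=0):
    """Minimal discrete TLBO sketch: each learner is a placement vector
    mapping task -> node index; cost(placement) returns a scalar to minimize."""
    rng = random.Random(seed)
    pop = [[rng.randrange(num_nodes) for _ in range(num_tasks)]
           for _ in range(pop_size)]

    def move_toward(learner, guide):
        # Discrete "difference" step: copy each gene from the guide with prob 0.5.
        return [g if rng.random() < 0.5 else x for x, g in zip(learner, guide)]

    for _ in range(iters):
        teacher = min(pop, key=cost)
        # Teacher phase: pull every learner toward the current best placement,
        # keeping the move only if it improves the learner.
        pop = [min(x, move_toward(x, teacher), key=cost) for x in pop]
        # Learner phase: learn from a random peer if the peer is better.
        for i in range(pop_size):
            j = rng.randrange(pop_size)
            if cost(pop[j]) < cost(pop[i]):
                cand = move_toward(pop[i], pop[j])
                if cost(cand) < cost(pop[i]):
                    pop[i] = cand
    return min(pop, key=cost)
```

A toy cost such as `cost = lambda p: sum(C[t][n] for t, n in enumerate(p))`, with `C[t][n]` an invented per-task/per-node cost table, is enough to exercise the loop; the paper's real objective combines dependencies, communication, and energy terms.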
A New Methodology for Process Modeling of Workflows
3
Authors: Sabah Al-Fedaghi, Rashid Alloughani, Mohammed Al Sanousi. Journal of Software Engineering and Applications, 2012, Issue 8, pp. 560-567 (8 pages)
Workflow-based systems are typically said to lead to better use of staff and better management and productivity. The first phase in building a workflow-based system is capturing the real-world process in a conceptual representation suitable for the following phases of formalization and implementation. The specification may be in text or diagram form or written in a formal language. This paper proposes a flow-based diagrammatic methodology as a tool for workflow specification. The expressiveness of the method is appraised through its ability to capture a workflow-based application. Here we show that the proposed conceptual diagrams are able to express situations arising in practice as an alternative to tools currently used in workflow systems. This is demonstrated by using the proposed methodology to partially build demo systems for two government agencies.
Keywords: process specification, workflow, conceptual modeling
Harnessing digital workflows for the understanding, promotion and participation in the conservation of heritage sites by meeting both ethical and technical challenges (Cited: 4)
4
Authors: Mario Santana Quintero, Reem Awad, Luigi Barazzetti. Built Heritage (CSCD), 2020, Issue 1, pp. 56-73 (18 pages)
The current application of digital workflows for the understanding, promotion and participation in the conservation of heritage sites involves several technical challenges and should be governed by serious ethical engagement. Recording consists of capturing (or mapping) the physical characteristics of the character-defining elements that give cultural heritage sites their significance. Usually, the outcome of this work represents the cornerstone information serving their conservation, whether it is used actively to maintain them or to ensure a posterity record in case of destruction. The records produced can guide the decision-making process at different levels by property owners, site managers, public officials, and conservators around the world, as well as convey the historical knowledge and values of these resources. Rigorous documentation may also serve a broader purpose: over time, it becomes the primary means by which scholars and the public apprehend a site that has since changed radically or disappeared. This contribution aims to provide an overview of the potential applications and threats of the technology utilised by heritage recording professionals, addressing the need to develop ethical principles that can improve heritage recording practice at large.
Keywords: documentation, recording, cultural heritage, best practice, ethical commitment, digital workflows
Making Data and Workflows Findable for Machines (Cited: 5)
5
Authors: Tobias Weigel, Ulrich Schwardmann, Jens Klump, Sofiane Bendoukha, Robert Quick. Data Intelligence, 2020, Issue 1, pp. 40-46, 303 (8 pages)
Research data infrastructures currently face a huge increase in the number of data objects, with an increasing variety of types (data types, formats) and of workflows by which objects need to be managed across their lifecycle. Researchers desire to shorten the workflows from data generation to analysis and publication, and the full workflow needs to become transparent to multiple stakeholders, including research administrators and funders. This poses challenges for research infrastructures and user-oriented data services in terms of not only making data and workflows findable, accessible, interoperable and reusable, but also doing so in a way that leverages machine support for better efficiency. One primary need to be addressed is that of findability, and achieving better findability has benefits for other aspects of data and workflow management. In this article, we describe how machine capabilities can be extended to make workflows more findable, in particular by leveraging the Digital Object Architecture, common object operations and machine learning techniques.
Keywords: findability, workflows, automation, FAIR data, data infrastructures, data services
Semantic typing of linked geoprocessing workflows (Cited: 1)
6
Authors: Simon Scheider, Andrea Ballatore. International Journal of Digital Earth (SCIE, EI), 2018, Issue 1, pp. 113-138 (26 pages)
In Geographic Information Systems (GIS), geoprocessing workflows allow analysts to organize their methods on spatial data in complex chains. We propose a method for expressing workflows as linked data, and for semi-automatically enriching them with semantics on the level of their operations and datasets. Linked workflows can be easily published on the Web and queried for types of inputs, results, or tools. Thus, GIS analysts can reuse their workflows in a modular way, selecting, adapting, and recommending resources based on compatible semantic types. Our typing approach starts from minimal annotations of workflow operations with classes of GIS tools, and then propagates data types and implicit semantic structures through the workflow using an OWL typing scheme and SPARQL rules by backtracking over GIS operations. The method is implemented in Python and is evaluated on two real-world geoprocessing workflows, generated with Esri's ArcGIS. To illustrate the potential applications of our typing method, we formulate and execute competency questions over these workflows.
Keywords: geoprocessing, spatial analysis, workflows, semantic typing, linked data
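The semantic-typing abstract above propagates types through a workflow from minimal annotations of operations with tool classes. The paper uses OWL and SPARQL; as a much simpler stand-in (all tool names and type labels below are invented for illustration), forward propagation over a chain of operations can be sketched in plain Python:

```python
# Toy vocabulary of GIS tool classes: each maps an input type to an output type.
TOOL_SIGNATURES = {
    "Buffer":        ("VectorLayer", "VectorLayer"),
    "Rasterize":     ("VectorLayer", "RasterLayer"),
    "SlopeAnalysis": ("RasterLayer", "RasterLayer"),
}

def propagate_types(workflow, source_types):
    """Forward-propagate semantic types through a chain of operations.

    workflow: list of (tool, input_name, output_name) triples.
    source_types: dict mapping source dataset names to their semantic type.
    Returns the inferred type of every dataset, or raises on a mismatch.
    """
    types = dict(source_types)
    for tool, src, dst in workflow:
        expected_in, out_type = TOOL_SIGNATURES[tool]
        if types.get(src) != expected_in:
            raise TypeError(
                f"{tool} expects {expected_in}, got {types.get(src)} for {src}")
        types[dst] = out_type
    return types
```

For example, `propagate_types([("Buffer", "roads", "buf"), ("Rasterize", "buf", "ras")], {"roads": "VectorLayer"})` infers that `ras` is a `RasterLayer`; the real method additionally handles backtracking and implicit semantic structure, which this sketch omits.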
GEOEssential–mainstreaming workflows from data sources to environment policy indicators with essential variables (Cited: 1)
7
Authors: Anthony Lehmann, Stefano Nativi, Paolo Mazzetti, Joan Maso, Ivette Serral, Daniel Spengler, Aidin Niamir, Ian McCallum, Pierre Lacroix, Petros Patias, Denisa Rodila, Nicolas Ray, Grégory Giuliani. International Journal of Digital Earth (SCIE), 2020, Issue 2, pp. 322-338 (17 pages)
When defining indicators on the environment, the use of existing initiatives should be a priority rather than redefining indicators each time. From an Information, Communication and Technology perspective, data interoperability and standardization are critical to improve data access and exchange, as promoted by the Group on Earth Observations. GEOEssential follows an end-user-driven approach by defining Essential Variables (EVs) as an intermediate level between environmental policy indicators and their appropriate data sources. Environmental policies and indicators are increasingly percolating down from global to local agendas. The scientific business processes for the generation of EVs and related indicators can be formalized in workflows specifying the necessary logical steps. To this aim, GEOEssential is developing a Virtual Laboratory whose main objective is to instantiate conceptual workflows, stored in a dedicated knowledge base, into executable workflows. To interpret and present the relevant outputs/results produced by the different thematic workflows considered in GEOEssential (i.e. biodiversity, ecosystems, extractives, night light, and the food-water-energy nexus), a Dashboard is built as a visual front-end. This is a valuable instrument to track progress towards environmental policies.
Keywords: SDGs, environmental policies, essential variables, earth observation, knowledge base, workflows, virtual research environment
Distributed Graph Database Load Balancing Method Based on Deep Reinforcement Learning
8
Authors: Shuming Sha, Naiwang Guo, Wang Luo, Yong Zhang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 5105-5124 (20 pages)
This paper focuses on the scheduling problem of workflow tasks that exhibit interdependencies. Unlike independent batch tasks, workflows typically consist of multiple subtasks with intrinsic correlations and dependencies. It necessitates distributing the various computational tasks to appropriate computing node resources in accordance with task dependencies to ensure the smooth completion of the entire workflow. Workflow scheduling must consider an array of factors, including task dependencies, availability of computational resources, and the schedulability of tasks. Therefore, this paper delves into the distributed graph database workflow task scheduling problem and proposes a workflow scheduling methodology based on deep reinforcement learning (DRL). The method optimizes the maximum completion time (makespan) and response time of workflow tasks, aiming to enhance the responsiveness of workflow tasks while ensuring the minimization of the makespan. The experimental results indicate that the Q-learning Deep Reinforcement Learning (Q-DRL) algorithm markedly diminishes the makespan and refines the average response time within distributed graph database environments. In quantifying makespan, Q-DRL achieves mean reductions of 12.4% and 11.9% over established First-fit and Random scheduling strategies, respectively. Additionally, Q-DRL surpasses the performance of both the DRL-Cloud and Improved Deep Q-learning Network (IDQN) algorithms, with improvements standing at 4.4% and 2.6%, respectively. With reference to average response time, the Q-DRL approach exhibits significantly enhanced performance in the scheduling of workflow tasks, decreasing the average by 2.27% and 4.71% when compared to IDQN and DRL-Cloud, respectively. The Q-DRL algorithm also demonstrates a notable increase in the efficiency of system resource utilization, reducing the average idle rate by 5.02% and 9.30% in comparison to IDQN and DRL-Cloud, respectively. These findings support the assertion that Q-DRL not only upholds a lower average idle rate but also effectively curtails the average response time, thereby substantially improving processing efficiency and optimizing resource utilization within distributed graph database systems.
Keywords: reinforcement learning, workflow, task scheduling, load balancing
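The Q-DRL abstract above assigns workflow tasks to nodes with reinforcement learning. A toy tabular Q-learning sketch illustrates the idea; note this is not the paper's deep network: the state here is just the index of the next task and the reward is the negative increase in makespan, both simplifying assumptions invented for the example.

```python
import random

def q_schedule(task_costs, num_nodes, episodes=500,
               alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning sketch: state = index of the next task,
    action = node to run it on. Reward is the negative increase in
    makespan, nudging the policy toward balanced node loads."""
    rng = random.Random(seed)
    Q = [[0.0] * num_nodes for _ in task_costs]
    for _ in range(episodes):
        loads = [0.0] * num_nodes
        for t, c in enumerate(task_costs):
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.randrange(num_nodes)
            else:
                a = max(range(num_nodes), key=lambda n: Q[t][n])
            before = max(loads)
            loads[a] += c
            reward = -(max(loads) - before)
            nxt = max(Q[t + 1]) if t + 1 < len(task_costs) else 0.0
            Q[t][a] += alpha * (reward + gamma * nxt - Q[t][a])
    # Greedy rollout of the learned policy.
    loads = [0.0] * num_nodes
    plan = []
    for t, c in enumerate(task_costs):
        a = max(range(num_nodes), key=lambda n: Q[t][n])
        plan.append(a)
        loads[a] += c
    return plan, max(loads)
```

The real problem additionally encodes task dependencies and resource availability in the state, which this load-only sketch deliberately leaves out.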
Developing food,water and energy nexus workflows
9
Authors: Ian McCallum, Carsten Montzka, Bagher Bayat, Stefan Kollet, Andrii Kolotii, Nataliia Kussul, Mykola Lavreniuk, Anthony Lehmann, Joan Maso, Paolo Mazzetti, Aline Mosnier, Emma Perracchione, Mario Putti, Mattia Santoro, Ivette Serral, Leonid Shumilo, Daniel Spengler, Steffen Fritz. International Journal of Digital Earth (SCIE), 2020, Issue 2, pp. 299-308 (10 pages)
There is a growing recognition of the interdependencies among the supply systems that rely upon food, water and energy. Billions of people lack safe and sufficient access to these systems, coupled with a rapidly growing global demand and increasing resource constraints. Modeling frameworks are considered one of the few means available to understand the complex interrelationships among the sectors; however, development of nexus-related frameworks has been limited. We describe three open-source models well known in their respective domains (i.e. TerrSysMP, WOFOST and SWAT), components of which, if combined, could help decision-makers address the nexus issue. We propose as a first step the development of simple workflows utilizing essential variables and addressing components of the above-mentioned models, which can act as building blocks to be used ultimately in a comprehensive nexus model framework. The outputs of the workflows and the model framework are designed to address the SDGs.
Keywords: food-water-energy nexus, modeling, workflows, model framework, essential variables, SDGs
Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes
10
Authors: Hendrik Nolte, Philipp Wieder. Data Intelligence (EI), 2022, Issue 2, pp. 426-438 (13 pages)
Since their introduction by James Dixon in 2010, data lakes have received more and more attention, driven by the promise of high reusability of the stored data due to schema-on-read semantics. Building on this idea, several additional requirements have been discussed in the literature to improve the general usability of the concept, such as a central metadata catalog including all provenance information, overarching data governance, or integration with (high-performance) processing capabilities. Although the necessity of a logical and a physical organisation of data lakes in order to meet those requirements is widely recognized, no concrete guidelines have yet been provided. The most common architecture implementing this conceptual organisation is the zone architecture, where data is assigned to a certain zone depending on its degree of processing. This paper discusses how FAIR Digital Objects can be used in a novel approach to organize a data lake based on data types instead of zones, how they can be used to abstract the physical implementation, and how they empower generic and portable processing capabilities based on a provenance-based approach.
Keywords: data lake, provenance, workflows, FAIR Digital Objects, CWFR
AudiWFlow: Confidential, collusion-resistant auditing of distributed workflows
11
Authors: Xiaohu Zhou, Antonio Nehme, Vitor Jesus, Yonghao Wang, Mark Josephs, Khaled Mahbub, Ali Abdallah. Blockchain: Research and Applications, 2022, Issue 3, pp. 1-11 (11 pages)
We discuss the problem of accountability when multiple parties cooperate towards an end result, such as multiple companies in a supply chain or departments of a government service under different authorities. In cases where a fully trusted central point does not exist, it is difficult to obtain a trusted audit trail of a workflow when each individual participant is unaccountable to all others. We propose AudiWFlow, an auditing architecture that makes participants accountable for their contributions in a distributed workflow. Our scheme provides confidentiality in most cases, collusion detection, and availability of evidence after the workflow terminates. AudiWFlow is based on verifiable secret sharing and real-time peer-to-peer verification of records; it further supports multiple levels of assurance to meet a desired trade-off between the availability of evidence and the overhead resulting from the auditing approach. We propose and evaluate two implementation approaches for AudiWFlow. The first one is fully distributed except for a central auxiliary point that, nevertheless, needs only a low level of trust. The second one is based on smart contracts running on a public blockchain, which is able to remove the need for any central point but requires integration with a blockchain.
Keywords: auditing, distributed workflows, confidentiality, blockchains, smart contracts
Formal Verification of Temporal Properties for Reduced Overhead in Grid Scientific Workflows (Cited: 2)
12
Authors: 曹军威, 张帆, 许可, 刘连臣, 吴澄. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2011, Issue 6, pp. 1017-1030 (14 pages)
With the quick development of grid techniques and the growing complexity of grid applications, it is becoming critical to reason about temporal properties of grid workflows to probe potential pitfalls and errors, in order to ensure reliability and trustworthiness at the initial design phase. A state Pi calculus is proposed and implemented in this work, which not only enables flexible abstraction and management of historical grid system events, but also facilitates modeling and temporal verification of grid workflows. Furthermore, a relaxed region analysis (RRA) approach is proposed to decompose large-scale grid workflows into sequentially composed regions with relaxation of parallel workflow branches, and corresponding verification strategies are also decomposed following modular verification principles. Performance evaluation results show that the RRA approach can dramatically reduce the CPU time and memory usage of formal verification.
Keywords: grid computing, workflow management, formal verification, state Pi calculus
FAIR Computational Workflows (Cited: 4)
13
Authors: Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober. Data Intelligence, 2020, Issue 1, pp. 108-121, 307-309 (17 pages)
Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products. They can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance. These properties aid data quality assessment and contribute to secondary data usage. Moreover, workflows are digital objects in their own right. This paper argues that FAIR principles for workflows need to address their specific nature in terms of their composition of executable software steps, their provenance, and their development.
Keywords: computational workflow, reproducibility, software, FAIR data, provenance
Canonical Workflows to Make Data FAIR (Cited: 2)
14
Authors: Peter Wittenburg, Alex Hardisty, Yann Le Franc, Amirpasha Mozaffari, Limor Peer, Nikolay A. Skvortsov, Zhiming Zhao, Alessandro Spinuso. Data Intelligence (EI), 2022, Issue 2, pp. 286-305 (20 pages)
The FAIR principles have been accepted globally as guidelines for improving data-driven science and data management practices, yet the incentives for researchers to change their practices are presently weak. In addition, data-driven science has been slow to embrace workflow technology despite clear evidence of recurring practices. To overcome these challenges, the Canonical Workflow Frameworks for Research (CWFR) initiative suggests a large-scale introduction of self-documenting workflow scripts to automate recurring processes or fragments thereof. This standardised approach, with FAIR Digital Objects as anchors, will be a significant milestone in the transition to FAIR data without adding additional load onto the researchers who stand to benefit most from it. This paper describes the CWFR approach and the activities of the CWFR initiative over the course of the last year or so, highlights several projects that hold promise for the CWFR approaches, including Galaxy, Jupyter Notebook, and RO Crate, and concludes with an assessment of the state of the field and the challenges ahead.
Keywords: workflow, data management, FAIR principles, digital objects
Concurrent and Storage-Aware Data Streaming for Data Processing Workflows in Grid Environments (Cited: 1)
15
Authors: 张文, 曹军威, 钟宜生, 刘连臣, 吴澄. Tsinghua Science and Technology (SCIE, EI, CAS), 2010, Issue 3, pp. 335-346 (12 pages)
Data streaming applications, usually composed of sequential/parallel data processing tasks organized as a workflow, bring new challenges to workflow scheduling and resource allocation in grid environments. Due to the high volumes of data and relatively limited storage capability, resource allocation and data streaming have to be storage-aware. Also, to improve system performance, the data streaming and processing have to be concurrent. This study used a genetic algorithm (GA) for workflow scheduling, with on-line measurements and predictions made using a gray model (GM). On-demand data streaming is used to avoid data overflow through repertory strategies. Tests show that tasks with on-demand data streaming must be balanced to improve overall performance, to avoid system bottlenecks and backlogs of intermediate data, and to increase data throughput for the data processing workflows as a whole.
Keywords: grid, data streaming, concurrent, storage-aware, workflow
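The abstract above mentions on-line prediction with a gray model (GM). A minimal GM(1,1) forecaster, the classic single-variable gray model, can be sketched as follows; this is illustrative only, and the paper's actual measurement pipeline and parameters are not reproduced here.

```python
import math

def gm11_predict(series, steps=1):
    """GM(1,1) gray-model sketch: fits dx/dt + a*x = b on the accumulated
    series and forecasts the next values of the original series."""
    n = len(series)
    x1 = [sum(series[:i + 1]) for i in range(n)]             # accumulated series
    z1 = [0.5 * (x1[i] + x1[i + 1]) for i in range(n - 1)]   # background values
    # Least squares for a, b in x0(k) = -a*z1(k) + b, k = 2..n.
    m = n - 1
    sz = sum(z1)
    szz = sum(z * z for z in z1)
    sy = sum(series[1:])
    szy = sum(z * y for z, y in zip(z1, series[1:]))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det

    def x1_hat(k):  # k is a 0-based index into the accumulated series
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    # De-accumulate to forecast the original series.
    return [x1_hat(n + i) - x1_hat(n + i - 1) for i in range(steps)]
```

On a series that is close to exponential (the case GM(1,1) is built for), the one-step forecast tracks the true next value closely; gray models are typically used, as in the abstract, for short-horizon on-line prediction from few samples.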
HPC-oriented Canonical Workflows for Machine Learning Applications in Climate and Weather Prediction (Cited: 1)
16
Authors: Amirpasha Mozaffari, Michael Langguth, Bing Gong, Jessica Ahring, Adrian Rojas Campos, Pascal Nieters, Otoniel Jose Campos Escobar, Martin Wittenbrink, Peter Baumann, Martin G. Schultz. Data Intelligence (EI), 2022, Issue 2, pp. 271-285 (15 pages)
Machine learning (ML) applications in weather and climate are gaining momentum as big data and the immense increase in high-performance computing (HPC) power are paving the way. Ensuring FAIR data and reproducible ML practices are significant challenges for Earth system researchers. Even though the FAIR principles are well known to many scientists, research communities are slow to adopt them. The Canonical Workflow Framework for Research (CWFR) provides a platform to ensure the FAIRness and reproducibility of these practices without overwhelming researchers. This conceptual paper envisions a holistic CWFR approach towards ML applications in weather and climate, focusing on HPC and big data. Specifically, we discuss FAIR Digital Objects (FDOs) and Research Objects (ROs) in the DeepRain project to achieve granular reproducibility. DeepRain is a project that aims to improve precipitation forecasts in Germany by using ML. Our concept envisages the raster datacube to provide data harmonization and fast and scalable data access. We suggest the Jupyter notebook as a single reproducible experiment. In addition, we envision JupyterHub as a scalable and distributed central platform that connects all these elements and the HPC resources to the researchers via an easy-to-use graphical interface.
Keywords: FAIR, reproducibility, machine learning, Earth system sciences, workflow
Mobility-Aware and Energy-Efficient Task Offloading Strategy for Mobile Edge Workflows
17
Authors: QIN Zhiwei, LI Juan, LIU Wei, YU Xiao. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2022, Issue 6, pp. 476-488 (13 pages)
With the rapid growth of the Industrial Internet of Things (IIoT), Mobile Edge Computing (MEC) has come to be widely used in many emerging scenarios. In MEC, each workflow task can be executed locally or offloaded to the edge to help improve Quality of Service (QoS) and reduce energy consumption. However, most of the existing offloading strategies focus on independent applications and cannot be applied efficiently to workflow applications with a series of dependent tasks. To address this issue, this paper proposes an energy-efficient task offloading strategy for large-scale workflow applications in MEC. First, we formulate the task offloading problem as an optimization problem with the goal of minimizing the utility cost, which is the trade-off between energy consumption and the total execution time. Then, a novel heuristic algorithm named Green DVFS-GA is proposed, which includes a task offloading step based on the genetic algorithm and a further step to reduce the energy consumption using the Dynamic Voltage and Frequency Scaling (DVFS) technique. Experimental results show that our proposed strategy can significantly reduce the energy consumption and achieve the best trade-off compared with other strategies.
Keywords: workflow application, task offloading, energy saving, heuristic algorithm, mobile edge computing
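The abstract above minimizes a utility cost that trades off energy against execution time, with DVFS scaling the device frequency. A hedged sketch of such a cost model follows; the constants, the `kappa` energy coefficient, and the function names are all invented for illustration, using the textbook assumption that dynamic energy scales as cycles × f².

```python
def utility_cost(energy, time, e_max, t_max, w=0.5):
    """Weighted trade-off between normalized energy and completion time.
    Lower is better; w tunes energy saving against responsiveness."""
    return w * energy / e_max + (1 - w) * time / t_max

def local_exec(cycles, f, kappa=1e-27):
    """Run on the device at CPU frequency f (Hz): time = cycles / f,
    dynamic energy ~ kappa * cycles * f^2 (classic DVFS energy model).
    Lowering f via DVFS saves energy at the cost of a longer run time."""
    return kappa * cycles * f ** 2, cycles / f

def offload_exec(data_bits, cycles, rate, p_tx, f_edge):
    """Offload: transmit the input data at `rate` bit/s with transmit
    power p_tx, then execute at the edge server at frequency f_edge.
    The device itself spends energy only on the transmission."""
    t_tx = data_bits / rate
    return p_tx * t_tx, t_tx + cycles / f_edge
```

Comparing `utility_cost` for the local and offloaded variants of each task is the core decision the paper's GA searches over; the real strategy additionally respects task dependencies across the whole workflow.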
Scaling Notebooks as Re-configurable Cloud Workflows
18
Authors: Yuandou Wang, Spiros Koulouzis, Riccardo Bianchi, Na Li, Yifang Shi, Joris Timmermans, W. Daniel Kissling, Zhiming Zhao. Data Intelligence (EI), 2022, Issue 2, pp. 409-425 (17 pages)
Literate computing environments, such as Jupyter (i.e., Jupyter Notebooks, JupyterLab, and JupyterHub), have been widely used in scientific studies; they allow users to interactively develop scientific code, test algorithms, and describe the scientific narratives of the experiments in an integrated document. To scale up scientific analyses, many implemented Jupyter environment architectures encapsulate whole Jupyter notebooks as reproducible units and autoscale them on dedicated remote infrastructures (e.g., high-performance computing and cloud computing environments). The existing solutions are still limited in many ways, e.g., 1) the workflow (or pipeline) is implicit in a notebook, and some steps could be generically used by different code and executed in parallel, but because of the tight cell structure, all steps in a Jupyter notebook have to be executed sequentially and lack the flexibility of reusing the core code fragments; and 2) there are performance bottlenecks that require improved parallelism and scalability when handling extensive input data and complex computation. In this work, we focus on how to manage the workflow in a notebook seamlessly. We 1) encapsulate the reusable cells as RESTful services and containerize them as portal components, 2) provide a composition tool for describing the workflow logic of those reusable components, and 3) automate the execution on remote cloud infrastructure. Empirically, we validate the solution's usability via a use case from the Ecology and Earth Science domain, illustrating the processing of massive Light Detection and Ranging (LiDAR) data. The demonstration and analysis show that our method is feasible, but that it needs further improvement, especially on integrating distributed workflow scheduling, automatic deployment, and execution, to develop into a mature approach.
Keywords: scientific experiments, Jupyter Notebooks, workflow management, ecosystem structure data products, cloud, scalability
Canonical Workflows in Simulation-based Climate Sciences
19
Authors: Ivonne Anders, Karsten Peters-von Gehlen, Hannes Thiemann. Data Intelligence (EI), 2022, Issue 2, pp. 212-225 (14 pages)
In this paper we present the derivation of Canonical Workflow Modules from current workflows in simulation-based climate science, in support of the elaboration of a corresponding framework for simulation-based research. We first identified the different users and user groups in simulation-based climate science based on their reasons for using the resources provided at the German Climate Computing Center (DKRZ). What is special here is that the DKRZ provides the climate science community with resources like high performance computing (HPC), data storage and specialised services, and hosts the World Data Center for Climate (WDCC); users can therefore perform their entire research workflows, up to the publication of the data, on the same infrastructure. Our analysis shows that the resources are used by two primary user types: those who require the HPC system to perform resource-intensive simulations and subsequently analyse them, and those who reuse, build on and analyse existing data. We then further subdivided these top-level user categories based on their specific goals and analysed their typical, idealised workflows applied to achieve the respective project goals. We find that, due to the subdivision and further granulation of the user groups, the workflows show apparent differences. Nevertheless, similar "Canonical Workflow Modules" can clearly be made out. These modules are "Data and Software (Re)use", "Compute", "Data and Software Storing", "Data and Software Publication" and "Generating Knowledge", and in their entirety they form the basis for a Canonical Workflow Framework for Research (CWFR). It is desirable that parts of the workflows in a CWFR act as FDOs, but we view this aspect critically. We also reflect on the question of whether the derivation of Canonical Workflow Modules from the analysis of current user behaviour still holds for future systems and work processes.
Keywords: canonical workflow, FAIR Digital Objects, HPC infrastructures, data services, climate science
Evaluation of Application Possibilities for Packaging Technologies in Canonical Workflows
20
Authors: Thomas Jejkal, Sabrine Chelbi, Andreas Pfeil, Peter Wittenburg. Data Intelligence (EI), 2022, Issue 2, pp. 372-385 (14 pages)
In the Canonical Workflow Framework for Research (CWFR), "packages" are relevant in two different directions. In data science, workflows are in general executed on a set of files which have been aggregated for specific purposes, such as for training a model in deep learning. We call this type of "package" a data collection; its aggregation and metadata description are motivated by research interests. The other type of "package" relevant for CWFR is supposed to represent workflows in a self-describing and self-contained way for later execution. In this paper, we review different packaging technologies and investigate their usability in the context of CWFR. For this purpose, we draw on an exemplary use case and show how packaging technologies can support its realization. We conclude that packaging technologies of different flavors help in providing inputs and outputs for workflow steps in a machine-readable way, as well as in representing a workflow and all its artifacts in a self-describing and self-contained way.
Keywords: Canonical Workflow Framework for Research, packaging technologies, research data collections, packaging formats