Abstract
[Objective] To address the tedious and inefficient procedures involved in complex multi-step computing and high-throughput computing in domain-specific scientific applications, this paper studies the key technologies of workflows for a scientific application platform. [Context] The paper combines a scientific application platform built on a high-performance computing environment with the concept of workflow. The approach applies to scientific computing software across multiple domains and multiple systems, and supports scientific research and engineering development that rely on high-performance computing applications. [Methods] To meet the application requirements of different fields, a multi-task concatenation workflow and a high-throughput application computing workflow are designed and implemented. The multi-task concatenation workflow provides a general logic scheme for user-defined workflows on both the server side and the client side, so that users can design their own multi-task concatenations; it also encapsulates domain-specific workflows in the high-performance computing environment to meet more specialized, proprietary requirements. When tasks are independent of each other, the high-throughput application computing workflow raises the degree of concurrency through multi-process concurrency and asynchronous upload of file streams. When tasks are interrelated, a script generates the batch files so that the platform interacts with the high-performance computing environment only once, and a two-layer master-slave load balancing scheme coordinates the concurrent execution of subtasks within the requested computing resources. [Results] Compared with the platform's ordinary task submission method, the multi-task concatenation workflow reduces the time users spend by a factor of nearly 10, and the high-throughput application computing workflow shows significant advantages in time consumption, ease of use, and degree of automation. [Conclusions] The scientific application platform workflows designed and implemented in this paper meet numerous complex application requirements in a more efficient and automated way, bringing higher-quality high-performance computing application services to researchers.
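The two-layer master-slave scheme named in [Methods] can be pictured with a minimal sketch. The abstract does not describe the actual implementation, so everything below is an assumption made for illustration: the function names (`run_subtask`, `sub_master`, `top_master`) are hypothetical, threads stand in for the processes or nodes of a real allocation, and the round-robin split at the top layer is just one possible partitioning policy.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subtask(task_id: int) -> int:
    """Hypothetical stand-in for one subtask of a high-throughput batch."""
    return task_id * task_id


def sub_master(tasks: list[int], workers_per_group: int) -> dict[int, int]:
    """Second layer: workers in this group pull the next subtask as they free up."""
    with ThreadPoolExecutor(max_workers=workers_per_group) as pool:
        return dict(zip(tasks, pool.map(run_subtask, tasks)))


def top_master(tasks: list[int], groups: int, workers_per_group: int) -> dict[int, int]:
    """First layer: split the batch across sub-masters (assumed round-robin), merge results."""
    chunks = [tasks[g::groups] for g in range(groups)]
    results: dict[int, int] = {}
    with ThreadPoolExecutor(max_workers=groups) as pool:
        partials = pool.map(lambda chunk: sub_master(chunk, workers_per_group), chunks)
        for partial in partials:
            results.update(partial)
    return results
```

Calling `top_master(list(range(10)), groups=3, workers_per_group=2)` runs the ten subtasks under three sub-masters with two workers each and returns one merged mapping from task id to result. Within a group, a worker that finishes early simply takes the next queued subtask, which is where the load balancing in this sketch comes from.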
Authors
武傲
李天颜
张宝花
徐顺
刘倩
WU Ao; LI Tianyan; ZHANG Baohua; XU Shun; LIU Qian (Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China; University of Chinese Academy of Sciences, Beijing 100049, China)
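Funding
National Key Research and Development Program of China, project "Scientific Computing Application Platform for Multi-Physics Complex Systems" (2020YFB0204802)
Gansu Province Science and Technology Program, project "Gansu Province Biomedical High-Performance Computing Demonstration Platform" (21YF5GA005).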
Keywords
high-performance computing application services
workflows
scientific application platform