数据与计算发展前沿 (Frontiers of Data and Computing), 2024, Vol. 6, Issue (4): 150-162.

CSTR: 32002.14.jfdc.CN10-1649/TP.2024.04.013

doi: 10.11871/jfdc.issn.2096-742X.2024.04.013

• Technology and Application •

Design and Implementation of Workflows of a Scientific Application Platform Based on High Performance Computing Environment

WU Ao (武傲)1,2, LI Tianyan (李天颜)1, ZHANG Baohua (张宝花)1, XU Shun (徐顺)1, LIU Qian (刘倩)1,*

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2023-04-28 Online: 2024-08-20 Published: 2024-08-20
  • Corresponding author: *LIU Qian (E-mail: liuqian@sccas.cn)
  • About the authors:
    WU Ao is a master's student at the Computer Network Information Center, Chinese Academy of Sciences, China. His main research interests are the research and construction of scientific application platforms and service systems in high-performance computing environments.
    In this paper, he is responsible for building the high-performance computing application platform, designing and implementing the workflows, performance testing of the workflows, and writing the paper.
    E-mail: wuao@cnic.cn
    LIU Qian is an associate researcher at the Computer Network Information Center, Chinese Academy of Sciences, China. Her main research interests are high-performance computing application service platforms and their applications.
    In this paper, she is responsible for the overall platform design and research guidance.
    E-mail: liuqian@sccas.cn
  • Funding:
    National Key Research and Development Program of China, project "Scientific Computing Application Platform for Multi-Physics Complex Systems" (2020YFB0204802); Gansu Provincial Science and Technology Program, project "Gansu Biomedical High-Performance Computing Demonstration Platform" (21YF5GA005)

Abstract:

[Objective] Complex multi-step and high-throughput computations in domain science applications involve tedious, inefficient procedures. To address this problem, this paper studies the key technologies of workflows for a scientific application platform. [Context] This paper combines a scientific application platform built on a high-performance computing environment with the concept of workflows. The approach applies to scientific computing software across multiple domains and systems, and supports scientific research and engineering development that rely on high-performance computing applications. [Methods] To meet the requirements of different fields, two workflows are designed and implemented: a multi-task concatenation workflow and a high-throughput application computing workflow. The multi-task concatenation workflow provides a general custom-workflow logic scheme on both the server side and the client side, so that users can design their own multi-task chains, and it also encapsulates domain-specific workflows in the high-performance computing environment to meet more specialized, proprietary requirements. When tasks are independent of each other, the high-throughput application computing workflow raises the degree of concurrency through multi-process concurrency and asynchronous upload of file streams. When tasks are interrelated, a script generates the batch files and the workflow interacts with the high-performance computing environment only once; within the allocated computing resources, a two-layer master-slave load-balancing scheme coordinates concurrent execution of the subtasks. [Results] Compared with the platform's ordinary task submission method, the multi-task concatenation workflow saves users nearly a factor of 10 in time, and the high-throughput application computing workflow shows significant advantages in time consumption, ease of use, and degree of automation.
[Conclusions] The scientific application platform workflows designed and implemented in this paper handle many complex application requirements more efficiently and with more automation, providing researchers with higher-quality high-performance computing application services.

Key words: high performance computing application services, workflows, scientific application platform
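
The abstract mentions a two-layer master-slave load-balancing scheme for interrelated high-throughput tasks but gives no implementation details. The following is a minimal sketch of one possible reading, not the authors' implementation: a top-level master splits the subtasks into batches, and several node-level masters pull batches from a shared queue and fan them out to their own workers, so a node that finishes early simply takes more work. All names (`run_subtask`, `node_master`, `run_two_layer`) and parameters are hypothetical, and threads stand in for the processes or compute nodes a real platform would use.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def run_subtask(task_id):
    """Hypothetical stand-in for one subtask; a real workflow would launch a solver here."""
    return task_id * task_id


def node_master(node_id, batches, results, lock, workers_per_node):
    """Second layer: pull batches from the shared queue and fan them out to local workers."""
    with ThreadPoolExecutor(max_workers=workers_per_node) as pool:
        while True:
            try:
                batch = batches.get_nowait()
            except queue.Empty:
                return  # the queue was filled before start, so empty means no work is left
            outputs = list(pool.map(run_subtask, batch))
            with lock:
                for task_id, value in zip(batch, outputs):
                    results[task_id] = (node_id, value)


def run_two_layer(task_ids, n_nodes=2, workers_per_node=4, batch_size=3):
    """First layer: split the tasks into batches and start one master per (simulated) node."""
    batches = queue.Queue()
    for start in range(0, len(task_ids), batch_size):
        batches.put(task_ids[start:start + batch_size])
    results, lock = {}, threading.Lock()
    masters = [
        threading.Thread(
            target=node_master,
            args=(n, batches, results, lock, workers_per_node),
        )
        for n in range(n_nodes)
    ]
    for t in masters:
        t.start()
    for t in masters:
        t.join()
    return results  # maps task_id -> (node that ran it, subtask output)


if __name__ == "__main__":
    done = run_two_layer(list(range(10)))
    print(sorted((task, value) for task, (_, value) in done.items()))
```

Pulling batches from a shared queue, rather than assigning them to nodes up front, is what gives the load balancing in this sketch: which node runs which batch is decided at run time, so the node assignment may differ between runs while the computed values stay the same.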