Frontiers of Data and Computing ›› 2024, Vol. 6 ›› Issue (4): 150-162.

CSTR: 32002.14.jfdc.CN10-1649/TP.2024.04.013

doi: 10.11871/jfdc.issn.2096-742X.2024.04.013

• Technology and Application •

Design and Implementation of Workflows of a Scientific Application Platform Based on High Performance Computing Environment

WU Ao1,2, LI Tianyan1, ZHANG Baohua1, XU Shun1, LIU Qian1,*

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2023-04-28 Online: 2024-08-20 Published: 2024-08-20

Abstract:

[Objective] To address complex multi-step and high-throughput computing problems in scientific domain applications, this paper studies the key workflow technologies for a scientific application platform. [Context] The paper combines a scientific application platform built on a high-performance computing environment with the concept of workflows. The resulting approach also suits scientific computing software spanning multiple domains and systems, and it supports scientific research and engineering development of related high-performance computing applications. [Methods] To meet the differing application requirements of different fields, two workflows are designed and implemented: a multi-task concatenation workflow and a high-throughput application computing workflow. The multi-task concatenation workflow implements a general, customizable workflow logic scheme on both the server side and the client side, allowing users to design multi-task concatenations themselves; it also encapsulates domain-specific workflows in the high-performance computing environment to meet more specialized requirements. When tasks are independent of each other, the high-throughput application computing workflow improves efficiency through multi-process concurrency and asynchronous file upload streams. When tasks are interrelated, a script generates the batch files so that the workflow interacts with the high-performance computing environment only once. Within the available computing resources, a two-layer master-slave load balancing scheme achieves collaborative concurrency among subtasks. [Results] Compared with the platform's ordinary task submission method, the multi-task concatenation workflow reduces the time users spend by up to a factor of 10, and the high-throughput application computing workflow shows significant advantages in time consumption, ease of use, and degree of automation.
[Conclusions] The scientific application platform workflows designed and implemented in this paper meet numerous complex application requirements more efficiently and with greater automation, bringing high-quality high-performance computing application services to a broad community of researchers.

Key words: high performance computing application services, workflows, scientific application platform
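
The two-layer master-slave load balancing mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scheduling policy, so the greedy least-loaded assignment, the use of threads in place of real compute resources, and all names (assign_to_groups, run_group, run_two_layer, costs) are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def assign_to_groups(costs, n_groups):
    """Layer 1 (assumed policy): the top-level master gives each subtask
    to the group with the smallest accumulated cost so far."""
    heap = [(0, g) for g in range(n_groups)]  # (accumulated cost, group id)
    heapq.heapify(heap)
    groups = [[] for _ in range(n_groups)]
    # Place the most expensive subtasks first so loads even out.
    for task_id in sorted(range(len(costs)), key=lambda i: -costs[i]):
        load, g = heapq.heappop(heap)
        groups[g].append(task_id)
        heapq.heappush(heap, (load + costs[task_id], g))
    return groups


def run_group(task_ids, run_subtask, workers_per_group):
    """Layer 2: a group-level master runs its subtasks on its own workers.
    Threads stand in for compute resources in this sketch."""
    with ThreadPoolExecutor(max_workers=workers_per_group) as pool:
        return dict(zip(task_ids, pool.map(run_subtask, task_ids)))


def run_two_layer(costs, run_subtask, n_groups=2, workers_per_group=2):
    """Run all subtasks: groups execute concurrently, and each group
    executes its own subtasks concurrently."""
    groups = assign_to_groups(costs, n_groups)
    results = {}
    with ThreadPoolExecutor(max_workers=n_groups) as masters:
        partials = masters.map(
            lambda ids: run_group(ids, run_subtask, workers_per_group),
            groups,
        )
        for partial in partials:
            results.update(partial)
    return groups, results


if __name__ == "__main__":
    # Hypothetical estimated costs for six subtasks; the subtask body is a stub.
    groups, results = run_two_layer([5, 3, 2, 2, 1, 1], lambda i: i * i)
    print(groups)
    print(results)
```

In this sketch the top-level master only decides which group owns each subtask, while each group-level master handles dispatch to its own workers; a production scheduler on a real cluster would replace the thread pools with job submissions to the high-performance computing environment.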