Frontiers of Data and Computing, 2022, Vol. 4, Issue (5): 3-10.

CSTR: 32002.14.jfdc.CN10-1649/TP.2022.05.001

doi: 10.11871/jfdc.issn.2096-742X.2022.05.001

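• Special Issue: East-West Computing Requirement Transfer: A Century Project Ushering in the Era of the Computing Power Economy (Part I)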



  • About the author: KOU Dazhi is a senior engineer at the Shanghai Supercomputer Center. His research interests include HPC cluster systems and HPC applications.
    In this paper, he was responsible for drawing up the paper framework and for writing the following sections: 1. Application running history database; 3. Container migration of HPC applications; 5. Conclusion and prospect.
    E-mail: dzkou@ssc.net.cn

Application-Aware Method for Optimized Computing Power Scheduling

KOU Dazhi1,*, WEI Jianwen2, TANG Xiaoyong3

  1. Shanghai Supercomputer Center, Shanghai 201203, China
  2. Center for High Performance Computing, Shanghai Jiao Tong University, Shanghai 200240, China
  3. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, Hunan 410114, China
  • Received: 2022-07-11 Online: 2022-10-20 Published: 2022-10-27
  • Contact: KOU Dazhi





[Objective] Under the “East-West Computing Requirement Transfer” project, supercomputing resources distributed across different regions are to be scheduled and managed together. To avoid overloaded and unevenly distributed computing resources, a multi-center task scheduling system needs to be developed, based on an investigation of the runtime characteristics of typical applications, so as to achieve unified management of the national high-performance computing environment. [Methods] First, log data on application execution are collected from several national supercomputing centers, and a database of these logs is established. Second, taking into account user resource demands and the resource usage characteristics of typical applications, a machine learning framework is built to accurately characterize application execution features. Third, the migration of HPC applications across clusters is implemented using containers. Finally, a task scheduling system based on application-aware resource scheduling optimization is developed. [Results] The system provides strong technical support for the services and efficient operation of the national high-performance computing environment. [Conclusions] The application-aware method for computing power scheduling optimization is expected to effectively improve the reliability, availability, and maintainability of the “East-West Computing Requirement Transfer” project.

Key words: high-performance computing system, historical database, application feature, computing power scheduling method
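
As a rough illustration of the application-aware idea summarized in the abstract, the following Python sketch picks a cluster for a job from historical run records. It is not the authors' system: the record layout (`RunRecord`), the load snapshot (`ClusterState`), the names `app_x`, `center_a`, and `center_b`, and the ideal-scaling runtime estimate are all assumptions made for illustration. A learned model, as described in the abstract, would take the place of `predict_runtime`.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class RunRecord:
    """One row of a hypothetical application run-history database."""
    app: str
    cluster: str
    cores: int
    runtime_s: float


@dataclass(frozen=True)
class ClusterState:
    """Hypothetical snapshot of a cluster's current load."""
    name: str
    free_cores: int
    queue_wait_s: float


def predict_runtime(history, app, cluster, cores):
    """Estimate runtime from the median core-seconds of past runs.

    Assumes ideal scaling with core count (an illustrative simplification).
    Returns None when no history exists for (app, cluster).
    """
    core_seconds = [r.cores * r.runtime_s for r in history
                    if r.app == app and r.cluster == cluster]
    if not core_seconds:
        return None
    return median(core_seconds) / cores


def choose_cluster(history, clusters, app, cores):
    """Pick the cluster with the smallest estimated wait plus runtime."""
    best_name, best_cost = None, float("inf")
    for c in clusters:
        if c.free_cores < cores:
            continue  # not enough free cores right now
        runtime = predict_runtime(history, app, c.name, cores)
        if runtime is None:
            continue  # no history for this application on this cluster
        cost = c.queue_wait_s + runtime
        if cost < best_cost:
            best_name, best_cost = c.name, cost
    return best_name, best_cost


# Made-up example data, not measurements from any real center.
history = [
    RunRecord("app_x", "center_a", 64, 1000.0),
    RunRecord("app_x", "center_a", 128, 520.0),
    RunRecord("app_x", "center_b", 64, 800.0),
]
clusters = [
    ClusterState("center_a", free_cores=256, queue_wait_s=60.0),
    ClusterState("center_b", free_cores=128, queue_wait_s=900.0),
]
name, cost = choose_cluster(history, clusters, "app_x", 128)
print(name, round(cost, 1))
```

With this made-up data, the sketch prefers `center_a`: its estimated runtime is longer than on `center_b`, but its much shorter queue wait gives the lower total. This trade-off between application behavior and current load is the kind of decision an application-aware scheduler is meant to make.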