Frontiers of Data and Computing (数据与计算发展前沿), 2020, Vol. 2, Issue (1): 85-92.

doi: 10.11871/jfdc.issn.2096-742X.2020.01.007

Special topic: Special Issue on "High-Performance and High-Throughput Computing and Applications"

Research on HPL Parallel Computing Model for a Class of Complex Heterogeneous Supercomputer System

Haitao Zhao, Jiachang Sun, Leisheng Li, Wenhao Yang, Hui Zhao, Huiyuan Li

  1. Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2019-12-18 Published: 2020-02-20 Posted online: 2020-06-04
  • Corresponding author: Huiyuan Li
  • About the authors:
    Zhao Haitao, Ph.D., is an associate researcher at the Institute of Software, Chinese Academy of Sciences. His main research interest is high-performance engineering and scientific computing. In this paper he is responsible for establishing and analyzing the parallel computing model. E-mail: haitao@iscas.ac.cn
    Sun Jiachang, Ph.D., is a chair professor at the Institute of Software, Chinese Academy of Sciences. His main research interest is high-performance computing. In this paper he is responsible for research on the parallel computing model. E-mail: jiachang@iscas.ac.cn
    Li Leisheng, Ph.D., is an associate researcher at the Institute of Software, Chinese Academy of Sciences. His main research interests are large-scale parallel scientific and engineering computing and heterogeneous accelerated computing. In this paper he is responsible for HPL performance optimization. E-mail: leisheng@iscas.ac.cn
    Yang Wenhao, who holds a master's degree, is an intern researcher at the Institute of Software, Chinese Academy of Sciences. His main research interest is high-performance computing. In this paper he is responsible for HPL performance optimization. E-mail: wenhao@iscas.ac.cn
    Zhao Hui, Ph.D., is an assistant researcher at the Institute of Software, Chinese Academy of Sciences. Her main research interest is high-performance engineering and scientific computing. In this paper she is responsible for HPL performance testing. E-mail: zhaohui2016@iscas.ac.cn
    Li Huiyuan, Ph.D., is a researcher at the Institute of Software, Chinese Academy of Sciences. His main research interest is engineering and scientific computing. In this paper he is responsible for research on the parallel computing model.
  • Funding:
    National Key R&D Program of China, High-Performance Computing Key Special Project "Computable Physical Modeling and Novel Computational Methods for Exascale Computing" (2016YFB02006010); National Key R&D Program of China, High-Performance Computing Special Project (2018YFB0204404); Strategic Priority Research Program of the Chinese Academy of Sciences (XDC01030200)

Abstract:

[Objective] To analyze supercomputer performance quickly and to speed up the optimization of HPL benchmark runs, this paper analyzes the main factors that influence HPL and establishes corresponding parallel computing models. [Methods] Building on the parallel optimization of the HPL benchmark program on the Sugon advanced computing system, theoretical analysis is combined with experimental verification to study the upper bound of HPL efficiency, fast performance prediction, and the influence of different parameters, and a parallel computing model is established for each of these problems. [Results] Compared with test results from the Sugon advanced computing system, the predictions agree well with the measurements. This indicates that the balance between computing performance and tasks, the share of matrix operations in the HPL computation, the efficiency of matrix operations, the utilization of matrix-operation library functions, and network transmission largely reflect the HPL efficiency of a supercomputing system. In addition, HPL efficiency is directly proportional to the matrix-operation efficiency of the accelerator cards. [Limitations] The parallel computing models do not yet cover all relevant factors, and the performance impact of the stability of large-scale computing systems needs further study. [Conclusions] Parallel computing models built for different prediction needs provide important guidance for HPL benchmark performance prediction and parallel optimization.

Key words: HPL, parallel computing model, heterogeneous system, computing efficiency, prediction
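
The notion of HPL efficiency used in the abstract can be made concrete with a minimal Python sketch. The first two functions use the standard HPL operation count, (2/3)N^3 + (3/2)N^2, and the usual definition of efficiency as achieved rate over theoretical peak. The third function, `factor_product_estimate`, is a hypothetical illustration only: it assumes the factors listed in the abstract can each be expressed as a fraction in (0, 1] and combined by multiplication. That combination rule and all function and parameter names are assumptions made here, not the models established in the paper.

```python
import math


def hpl_flops(n: int) -> float:
    """Floating-point operations HPL credits for an n-by-n dense solve."""
    return (2.0 / 3.0) * n**3 + 1.5 * n**2


def hpl_efficiency(n: int, seconds: float, peak_flops: float) -> float:
    """Measured efficiency: achieved flop rate divided by theoretical peak."""
    if seconds <= 0 or peak_flops <= 0:
        raise ValueError("seconds and peak_flops must be positive")
    return hpl_flops(n) / seconds / peak_flops


def factor_product_estimate(balance: float, matrix_share: float,
                            matrix_eff: float, lib_util: float,
                            network: float) -> float:
    """Hypothetical estimate: product of per-factor fractions in (0, 1].

    Assumption for illustration only; the paper's actual models are not
    reproduced here.
    """
    factors = (balance, matrix_share, matrix_eff, lib_util, network)
    if any(not (0.0 < f <= 1.0) for f in factors):
        raise ValueError("each factor must lie in (0, 1]")
    return math.prod(factors)
```

Under this multiplicative assumption the estimate scales linearly with `matrix_eff` when the other factors are held fixed, which matches the proportionality between accelerator matrix-operation efficiency and HPL efficiency reported in the abstract.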