Frontiers of Data & Computing, 2025, Vol. 7, Issue 5: 123-137.

CSTR: 32002.14.jfdc.CN10-1649/TP.2025.05.010

doi: 10.11871/jfdc.issn.2096-742X.2025.05.010

• Special Issue: New Domestic Computing Power Driving New Advances in Scientific Computing Applications

Porting and Parallel Optimization of the Gas-Phase Chemistry Module of the Air Quality Model EPICC-Model on China’s Domestic Accelerators

CAO Kai1, TANG Xiao1,*, CHEN Huansheng1, MA Jingang2,3, WU Qizhong4, WANG Wending1, CHEN Xueshun1, LI Jinxi1, WANG Zifa1

  1. Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
  2. 3Clear Technology Co., Ltd., Beijing 100029, China
  3. Ma’anshan University, Ma’anshan, Anhui 243100, China
  4. Beijing Normal University, Beijing 100875, China
  • Received: 2025-01-14 Online: 2025-10-20 Published: 2025-10-23
  • Corresponding author: TANG Xiao
  • About the authors:
    CAO Kai is a postdoctoral fellow at the Institute of Atmospheric Physics, Chinese Academy of Sciences. His main research interests include high-performance computing for air quality models and its applications.
    In this paper, he is mainly responsible for the code implementation and application testing of the EPICC-Model gas-phase chemistry module on domestic accelerators.
    E-mail: caokai@mail.iap.ac.cn
    TANG Xiao is a professor-level senior engineer at the Institute of Atmospheric Physics, Chinese Academy of Sciences. His main research interests include the development of air quality models, atmospheric pollution data assimilation, inversion of atmospheric pollution source inventories, and atmospheric pollution simulation and forecasting.
    In this paper, he is mainly responsible for guiding the development of the EPICC-Model gas-phase chemistry module and verifying its correctness.
    E-mail: tangxiao@mail.iap.ac.cn
  • Funding:
    Guanghe Fund (202302017828); Strategic Priority Research Program of the Chinese Academy of Sciences for Basic and Interdisciplinary Frontier Research (XDB0760401); Independent Research Project of the State Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences (2024QN08); Key Research and Development Program of Henan Province (241111212300)

Abstract:

[Objective] To address the consistency and stability of cross-architecture computing in the air quality model EPICC-Model and to improve its parallel computing efficiency, this study selects the model's hotspot module, the gas-phase chemistry module, and carries out its porting to and parallel optimization on domestic accelerators. [Methods] A fourth-order implicit Rosenbrock solver is used to solve the system of ordinary differential equations governing chemical reaction kinetics, with an embedded third-order method for error estimation. The gas-phase chemistry module is ported to domestic accelerators using the C language and the HIP heterogeneous programming model. It is then further optimized through coalesced global memory access, computational optimizations, and “core-card” collaborative computing, in order to improve the computational efficiency and parallel scalability of EPICC-Model on domestic heterogeneous clusters. [Results] Compared with the elapsed time on a single core and on 32 cores of a domestic general-purpose CPU, the computational efficiency of the standalone gas-phase chemistry module on the domestic accelerator increases by 61.2 times and 3.0 times, respectively. After the module is coupled into EPICC-Model, the model's computational load imbalance is alleviated, and the total model elapsed time is reduced by more than 45% under every “core-card” configuration tested. [Limitations] Future work needs to further improve the communication efficiency between the CPU and the accelerator, increase the proportion of EPICC-Model code that runs on domestic accelerators, and enable parallel computing on domestic accelerators for the model's other hotspot modules, or even for the entire integration module excluding I/O. [Conclusions] The domestic accelerator significantly improves the computational efficiency of the gas-phase chemistry module, laying a foundation for ultra-high-resolution air quality simulation and forecasting in the future.

Key words: air quality model, EPICC-Model, gas-phase chemistry module, domestic accelerators, heterogeneous porting, parallel computing
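
The solver idea named in the Methods can be illustrated with a much smaller relative of the scheme used in the paper. The sketch below is not the authors' fourth-order solver with its embedded third-order estimator. It swaps in the two-stage, second-order Rosenbrock scheme commonly called ROS2, and pairs it with a first-order embedded solution (`y + h*k1`) chosen here purely for illustration. It is written in plain Python rather than C/HIP, and the two-species reversible reaction, its rate constants, and all function names (`solve2`, `ros2_step`, `integrate`, `rhs`, `jac`) are hypothetical.

```python
import math

GAMMA = 1.0 + 1.0 / math.sqrt(2.0)  # ROS2 parameter; this value makes the scheme L-stable


def solve2(m, r):
    """Solve the 2x2 linear system m * x = r by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [
        (r[0] * m[1][1] - m[0][1] * r[1]) / det,
        (m[0][0] * r[1] - r[0] * m[1][0]) / det,
    ]


def ros2_step(f, jac, y, h):
    """Take one ROS2 step; return (second-order solution, error estimate)."""
    j = jac(y)
    # Both stages share the same matrix I - gamma*h*J, so only linear solves are needed.
    m = [[(1.0 if r == c else 0.0) - GAMMA * h * j[r][c] for c in range(2)]
         for r in range(2)]
    k1 = solve2(m, f(y))
    f_mid = f([y[i] + h * k1[i] for i in range(2)])
    k2 = solve2(m, [f_mid[i] - 2.0 * k1[i] for i in range(2)])
    y_new = [y[i] + h * (1.5 * k1[i] + 0.5 * k2[i]) for i in range(2)]
    # Assumed embedded first-order solution: y + h*k1. The difference between
    # the two solutions serves as the local error estimate.
    err = max(abs(0.5 * h * (k1[i] + k2[i])) for i in range(2))
    return y_new, err


def integrate(f, jac, y0, t_end, h0=1e-3, tol=1e-6):
    """Adaptive integration from t = 0 to t_end with absolute tolerance tol."""
    t, y, h, steps = 0.0, list(y0), h0, 0
    while t_end - t > 1e-12:
        h = min(h, t_end - t)
        y_new, err = ros2_step(f, jac, y, h)
        if err <= tol:  # accept the step; otherwise retry with a smaller h
            t, y, steps = t + h, y_new, steps + 1
        # The estimate scales like h^2, hence the square root in the controller.
        h *= min(5.0, max(0.2, 0.9 * math.sqrt(tol / max(err, 1e-300))))
    return y, steps


KF, KB = 1000.0, 1.0  # hypothetical forward/backward rates for A <-> B (stiff)


def rhs(y):
    """Reaction rates d[A]/dt and d[B]/dt."""
    rate = KF * y[0] - KB * y[1]
    return [-rate, rate]


def jac(y):
    """Exact Jacobian of rhs (constant for this linear system)."""
    return [[-KF, KB], [KF, -KB]]


y_end, n_steps = integrate(rhs, jac, [1.0, 0.0], 1.0)
print(y_end, n_steps)
```

Because the Jacobian is exact and its columns sum to zero, each stage conserves the total concentration up to rounding. In a port of the kind the abstract describes, a per-cell loop like `integrate` would presumably run once per grid cell on the accelerator, which is where the data layout and coalesced access start to matter.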