Frontiers of Data and Computing, 2025, Vol. 7, Issue 2: 68-85.
CSTR: 32002.14.jfdc.CN10-1649/TP.2025.02.008
doi: 10.11871/jfdc.issn.2096-742X.2025.02.008
LIU Yang1,2, XU Jianfei1,2, XU Huangchao1,2, WU Can1, HU Taiyuan1,2, YUAN Huifeng1,2, GAO Lingyun1,2, LIANG Wenhao1,2, DONG Sheng1,2, MA Yingjin1, LI Ruilin1,*, ZHAO Yonghua1,*
Received: 2024-10-10
Online: 2025-04-20
Published: 2025-04-23
Contact: LI Ruilin, ZHAO Yonghua
About the author: LIU Yang is a Ph.D. candidate at the Computer Network Information Center, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences, and a CCF student member. His main research interests are high-performance computing and numerical algorithms.
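Abstract:
[Objective] With the rapid development of information technology and the surge in global data volume, supercomputers have become an important driving force for scientific research and innovation. This paper examines the current status and development trends of supercomputer applications across multiple fields. [Methods] Based on a broad survey of supercomputers and domain applications worldwide, we systematically classify and summarize the relevant high-performance computing applications, focusing on chemistry and materials, physics, and several other fields, and discuss how their computational demands are adapted to and deployed on supercomputers. We also discuss grid computing and supercomputer interconnection. [Results] Supercomputers have already delivered significant results in many application fields. As application demands grow and high-performance computing technology advances, higher requirements are also placed on supercomputer hardware and software. [Limitations] Although supercomputing is developing rapidly and its range of applications is wide, this paper analyzes and summarizes only representative application fields. [Conclusions] The efficiency of supercomputers in accelerating scientific discovery and technological innovation has improved significantly, providing strong support for future research and applications. Improving the performance and adaptability of supercomputers will be an important guarantee for future scientific progress.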
Cite this article: LIU Yang, XU Jianfei, XU Huangchao, WU Can, HU Taiyuan, YUAN Huifeng, GAO Lingyun, LIANG Wenhao, DONG Sheng, MA Yingjin, LI Ruilin, ZHAO Yonghua. Research on the Status and Trends of High-Performance Computing Applications Development on Supercomputers[J]. Frontiers of Data and Computing, 2025, 7(2): 68-85, https://cstr.cn/32002.14.jfdc.CN10-1649/TP.2025.02.008.
Table 1
Top 10 systems on the latest Top500 supercomputer list as of June 2024
Rank | Supercomputer | Site | Rmax (PFlop/s) | Interconnect | Processor architecture |
---|---|---|---|---|---|
1 | Frontier | Oak Ridge National Laboratory, USA | 1,206.00 | Slingshot-11 | AMD EPYC + AMD Instinct MI250 |
2 | Aurora | Argonne National Laboratory, USA | 1,012.00 | Slingshot-11 | Intel Xeon + Intel Data Center GPU Max |
3 | Eagle | Microsoft, USA | 561.20 | NVIDIA Infiniband NDR | Intel Xeon + NVIDIA H100 |
4 | Supercomputer Fugaku | RIKEN, Japan | 442.01 | Tofu interconnect D | Fujitsu A64FX |
5 | LUMI | CSC data center, Finland | 379.70 | Slingshot-11 | AMD EPYC + AMD Instinct MI250 |
6 | Alps | Swiss National Supercomputing Centre, Switzerland | 270.00 | Slingshot-11 | NVIDIA Grace + NVIDIA GH200 |
7 | Leonardo | Bologna Technopole, Italy | 241.20 | Quad-rail NVIDIA HDR100 Infiniband | Intel Xeon + NVIDIA A100 |
8 | MareNostrum 5 ACC | BSC-CNS, Spain | 175.30 | Infiniband NDR | Intel Xeon + NVIDIA H100 |
9 | Summit | Oak Ridge National Laboratory, USA | 148.60 | Dual-rail Mellanox EDR Infiniband | IBM POWER9 + NVIDIA V100 |
10 | Eos NVIDIA DGX SuperPOD | NVIDIA Corporation, USA | 121.40 | Infiniband NDR400 | Intel Xeon + NVIDIA H100 |
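Table 2
High-performance applications that were Gordon Bell Prize finalists or winners in the past five years, and their disciplines
Year | Project | Supercomputer | Application field |
---|---|---|---|
2023 | Large-scale materials modeling at quantum accuracy: ab initio simulations of quasicrystals and interacting extended defects in metallic alloys (*) | Frontier | Materials science |
2023 | Towards exascale computation for turbomachinery flows | Sunway | Fluid mechanics |
2023 | Exploring the ultimate regime of turbulent Rayleigh-Bénard convection through spectral-element simulations | LUMI, Leonardo | Fluid mechanics |
2023 | Scaling the "memory wall" for multi-dimensional seismic processing with algebraic compression | Cerebras CS-2 | Earth science |
2023 | Scaling the leading accuracy of deep equivariant models to biomolecular simulations of realistic size | Perlmutter | Molecular biology |
2023 | Exascale multiphysics nuclear reactor simulations for advanced designs | Frontier | Nuclear science and technology |
2023 | The simple cloud-resolving E3SM atmosphere model running on the Frontier exascale system (#) | Frontier | Earth science |
2022 | Design of laser-based electron accelerators on exascale supercomputers (*) | Frontier, Summit, Perlmutter, Fugaku | Physics |
2022 | 2.5 million-atom ab initio electronic-structure simulation of complex metallic heterostructures with DGDFT | Sunway | Computational chemistry |
2022 | Exaflops biomedical knowledge graph analytics | Frontier | Biomedical engineering |
2022 | Geostatistical modeling and prediction for extreme-scale environmental applications | Fugaku | Earth science |
2022 | Extreme-scale earthquake simulation with uncertainty quantification | Fugaku | Earth science |
2022 | Extreme-scale many-against-many protein similarity search | Summit | Molecular biology |
2022 | Groundbreaking mesh-refined particle-in-cell simulations on exascale supercomputers | Perlmutter, Summit, Fugaku | Physical mechanics |
2022 | GenSLMs: genome-scale language models reveal SARS-CoV-2 evolutionary dynamics (#) | Polaris, Selene | Molecular biology |
2021 | Real-time simulation of an extreme-scale random quantum circuit (*) | Sunway | Quantum circuit simulation |
2021 | Twenty microseconds of molecular dynamics simulation before lunch | Anton 3 | Physical mechanics |
2021 | Symplectic structure-preserving electromagnetic fully kinetic plasma simulation with massive parallelism on multiple architectures | Sunway | Fluid mechanics |
2021 | First-principles Raman spectra simulation scalable to tens of millions of cores | Sunway | Computational chemistry |
2021 | Billion-atom molecular dynamics simulations of carbon at extreme conditions | Summit | Physical mechanics |
2021 | Large-scale distribution of cosmic relic neutrinos in six-dimensional phase space | Fugaku | Physical mechanics |
2021 | Digital simulation for droplet and aerosol infection risk assessment (#) | Fugaku | Molecular biology, fluid mechanics |
2020 | Pushing the limit of molecular dynamics with ab initio accuracy to 100 million atoms with machine learning (*) | Summit | Physical mechanics |
2020 | Scalable knowledge graph analytics at 136 petaflop/s | Summit | Computer science |
2020 | Accelerating large-scale excited-state GW calculations on leadership HPC systems | Summit | Computational chemistry |
2020 | Toward realization of numerical towing-tank tests by wall-resolved large eddy simulation based on 32 billion grid finite-element computation | Fugaku | Fluid mechanics |
2020 | Processing full-scale Square Kilometre Array data | Summit | Earth science |
2020 | A 1024-member ensemble data assimilation study with 3.5 km mesh global weather simulations | Fugaku | Earth science |
2020 | AI-driven multiscale simulations illuminate mechanisms of SARS-CoV-2 spike dynamics (#) | Summit | Biomedicine |
2019 | A data-centric approach to extreme-scale ab initio dissipative quantum transport simulations (*) | Summit, Piz Daint | Materials science |
2019 | Fast, scalable and accurate finite-element based ab initio calculations using mixed precision computing: 46 PFLOPS simulation of a metallic dislocation system | Summit | Materials science |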
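Table 3
Applications and adaptation of major fields on supercomputers
Field | Application | Core governing equations | Computational behavior | Computational hotspots | Software |
---|---|---|---|---|---|
Chemistry and materials science | Quantum chemistry / molecular, atomic, and electronic structure | Schrödinger equation, etc. | Single-node or few-node parallelism | Two-electron integral evaluation, etc. | Gaussian, NWChem, etc. |
Chemistry and materials science | Molecular simulation / dynamics | Newton's equations, etc. | Single-node and multi-node parallelism | Force-field computation, etc. | LAMMPS, GROMACS, etc. |
Chemistry and materials science | Materials simulation / first-principles calculation | Constitutive relations, etc. | Single-node or few-node parallelism | Grid-point computation, etc. | VASP, etc. |
Physics and mechanics | Fluid mechanics | Navier-Stokes equations, etc. | Single-node and multi-node parallelism | PDE discretization and solution, etc. | Fluent, SWSPH, OpenCFD, etc. |
Physics and mechanics | High-energy physics / quantum field theory | Quantum field theory equations, etc. | Single-node and multi-node parallelism | Solution of nonlinear systems of equations, etc. | QUDA, Geant4, etc. |
Biology, pharmacy, and bioengineering | Biomedicine | Free-energy equations, etc. | Single-node and multi-node parallelism | Force-field computation, sampling algorithms, etc. | AutoDock GPU, etc. |
Biology, pharmacy, and bioengineering | Genome sequence analysis | Alignment algorithms, etc. | Single-node and multi-node parallelism | Sequence alignment and variant analysis, etc. | BWA-MEM, HISAT2, etc. |
Biology, pharmacy, and bioengineering | Protein structure prediction | Scoring functions, etc. | Single-node and multi-node parallelism | Protein folding prediction, etc. | Rosetta, AlphaFold2, etc. |
Earth science | Mineral and oil-and-gas resource exploration | Polynomial fitting, etc. | Multi-node parallelism | Cluster analysis, etc. | GeoEast, etc. |
Earth science | Atmosphere-ocean analysis and modeling | Coupled models, spectral analysis, etc. | Multi-node parallelism | Data simulation and predictive analysis, etc. | SCREAM, LICOM, etc. |
Other | Quantum circuit simulation | Tensor contraction, etc. | Single-node and multi-node parallelism | Logic circuits and quantum state evolution, etc. | SWQSIM, etc. |
Other | Artificial intelligence and model training | Convolution, etc. | Single-node and multi-node parallelism | Model training, etc. | GPT series, etc. |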