Frontiers of Data and Computing ›› 2024, Vol. 6 ›› Issue (1): 102-112.

CSTR: 32002.14.jfdc.CN10-1649/TP.2024.01.010

doi: 10.11871/jfdc.issn.2096-742X.2024.01.010

• Technology and Applications •

GPU Parallel Optimization for Fast Radio Burst Search Based on Containerization

WANG Yuming1,2, WU Kaichao1,*, NIU Chenhui3, ZHANG Xiaoli1

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  3. National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2022-11-21 Online: 2024-02-20 Published: 2024-02-21
  • Corresponding author: * WU Kaichao (E-mail: kaichao@cnic.cn)
  • About the authors:
    WANG Yuming is a master's student in the Science and Technology Cloud Technology and Application Development Department, Computer Network Information Center, Chinese Academy of Sciences. His research interests include containerization and distributed systems, and astronomical science applications.
    In this paper, he is responsible for algorithm analysis, implementation of the parallel optimization scheme, and writing the article.
    E-mail: wangyuming192@mails.ucas.ac.cn
    WU Kaichao is a professor-level senior engineer in the Science and Technology Cloud Technology and Application Development Department, Computer Network Information Center, Chinese Academy of Sciences. His research interests include data-intensive computing and astronomical science applications.
    In this paper, he is responsible for guiding the design and improvement of the containerization optimization scheme.
    E-mail: kaichao@cnic.cn
  • Funding:
    Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0115300)

Abstract:

[Context] Fast Radio Burst (FRB) search is one of the key scientific goals of the 500-meter Aperture Spherical Radio Telescope (FAST). It is computationally intensive and involves a large volume of raw data. The current FRB search algorithm has low GPU utilization, and data processing requires considerable manual intervention. [Objective] Without modifying the algorithm implementation, this work aims to achieve process-level GPU parallel optimization, raise overall GPU resource utilization, simplify the running and scheduling of the algorithm, and support driving the computation with automated scripts. [Methods] The FRB search algorithm is encapsulated in containers and combined with GPU aggregation technology so that multiple FRB search containers run as parallel processes, which allows idle GPU time to be reused. Containerized encapsulation hides technical details such as GPU calls and dependency library management, reducing manual intervention. [Results] The experimental results show that, without modifying the original algorithm or adding GPU resources, binding 6 computing processes to a single GPU achieves a speedup of 5.3 and a parallel efficiency of 0.88 for the FRB search algorithm. [Conclusions] Parallel optimization based on containerized encapsulation and process-level GPU aggregation improves GPU utilization and computing efficiency and effectively supports automated processing. The method is also general and can be applied to the parallel optimization of similar applications.

Key words: fast radio burst, containerization, process-level GPU parallel optimization, GPU aggregation
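
The speedup and parallel efficiency reported in the abstract can be cross-checked against each other. The sketch below assumes the standard definition of parallel efficiency, E = S / p (speedup divided by the number of parallel processes); the abstract does not state the formula, and the function name `parallel_efficiency` is illustrative only.

```python
def parallel_efficiency(speedup: float, num_processes: int) -> float:
    """Parallel efficiency under the standard definition E = S / p (an assumption here)."""
    if num_processes <= 0:
        raise ValueError("num_processes must be positive")
    return speedup / num_processes


# Figures reported in the abstract: speedup 5.3 with 6 processes bound to one GPU.
efficiency = parallel_efficiency(5.3, 6)
print(f"{efficiency:.2f}")  # prints 0.88
```

Under that assumption, 5.3 / 6 ≈ 0.883, which is consistent with the reported efficiency of 0.88.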