Frontiers of Data and Computing ›› 2024, Vol. 6 ›› Issue (1): 102-112.

CSTR: 32002.14.jfdc.CN10-1649/TP.2024.01.010

doi: 10.11871/jfdc.issn.2096-742X.2024.01.010

• Technology and Application •

GPU Parallel Optimization for Fast Radio Burst Search Based on Containerization

WANG Yuming1,2, WU Kaichao1,*, NIU Chenhui3, ZHANG Xiaoli1

    1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
    2. Chinese Academy of Sciences, Beijing 100049, China
    3. National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2022-11-21 Online: 2024-02-20 Published: 2024-02-21

Abstract:

[Context] Searching for Fast Radio Bursts (FRBs) is one of the key scientific goals of the 500-meter Aperture Spherical Radio Telescope (FAST). The search is computationally expensive and must process a large volume of raw data, yet current FRB search algorithms achieve low GPU utilization and require considerable manual intervention during data processing. [Objective] This work studies a process-level GPU parallel optimization method that enables multi-process parallelism, improves GPU utilization, simplifies the operation and scheduling of the algorithms, and supports automated scripts, all without modifying the algorithm implementation. [Methods] The FRB search algorithm is encapsulated in containers and combined with GPU aggregation technology to support multiplexing of GPU idle time and to hide technical details such as GPU calls and dependency library management. [Results] Experiments show that, with 6 computing processes bound to a single GPU, the parallel optimization scheme achieves a speedup of 5.3 for the FRB search algorithm and a parallel efficiency of 0.88, without modifying the original algorithm or adding GPU resources. [Conclusions] Parallel optimization based on containerized encapsulation and process-level GPU aggregation effectively improves GPU utilization and computing efficiency and supports automated processing. The scheme is also general and can be applied to the parallel optimization of similar applications.
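
The speedup and parallel efficiency reported in the abstract can be related with a short calculation. The sketch below assumes the conventional definition, parallel efficiency = speedup / number of parallel processes; the abstract does not state which definition was used, and the function name `parallel_efficiency` is illustrative only. Under that assumption, a speedup of 5.3 with 6 processes gives an efficiency of about 0.88, which matches the reported figure.

```python
def parallel_efficiency(speedup: float, num_processes: int) -> float:
    """Return parallel efficiency under the conventional definition.

    Assumption: efficiency = speedup / number of parallel processes.
    The abstract reports the values but not the formula.
    """
    if num_processes <= 0:
        raise ValueError("num_processes must be a positive integer")
    return speedup / num_processes


if __name__ == "__main__":
    # Figures taken from the abstract: speedup 5.3 with 6 processes on one GPU.
    efficiency = parallel_efficiency(5.3, 6)
    print(f"parallel efficiency: {efficiency:.2f}")
```

An efficiency of 1.0 would mean that each of the 6 processes contributes a full unit of speedup; a value below 1.0 reflects the overhead of sharing a single GPU among the processes.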

Key words: fast radio burst, containerization, process-level GPU parallel optimization, GPU aggregation