Frontiers of Data and Computing ›› 2021, Vol. 3 ›› Issue (3): 136-147.

doi: 10.11871/jfdc.issn.2096-742X.2021.03.012

• Technology and Applicaton • Previous Articles     Next Articles

An Effective Implementation of LDLT Decomposition of Sparse Symmetric Matrix on GPU

Chen Xinfeng1,2(),Wang Wu1,*()   

  1. 1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received:2021-02-01 Online:2021-06-20 Published:2021-07-09
  • Contact: Wang Wu E-mail:chenxinfeng@cnic.cn;wangwu@sccas.cn

Abstract:

[Objective] LDLT decomposition is an effective tool to solve many problems in sparse symmetric linear systems, especially for those problems which are hard to converge using iterative solvers. However, it is difficult to implement LDLT on the GPU for data dependency and irregular data access during the factorization. [Methods] In this paper, an effective GPU-based LDLT decomposition method of sparse symmetric matrix is designed and implemented based on Cholesky symbolic decomposition, right-looking decomposition algorithm and level partition of the dependency graph for the sparse matrix. By using controlled kernel launch for CUDA dynamic parallelism, all three loops of the algorithm are parallelized, so the proposed method can achieve higher parallelism.[Results] Experimental results show that the implementation of LDLT on GPU can achieve a maximum speedup of 46.2 compared to UMFPACK for a typical collection of sparse symmetric matrix. [Conclusions] CUDA implementation of LDLT can give reference to high performance numerical algorithm research and implementation for sparse matrix on GPU-based heterogeneous platforms.

Key words: LDLT decomposition, right-looking algorithm, GPU, dynamic parallelism