An Effective Implementation of LDLT Decomposition of Sparse Symmetric Matrix on GPU
Chen Xinfeng, Wang Wu

Pseudocode 3. Multithreaded factorization of the current column
function factorize(Lp, Li, Lx, tmpMem, tmpMem1, d, n, k, col)
    id = blockIdx.x*blockDim.x + threadIdx.x;   // global thread index
    if id < Lp[k+1]-Lp[k]-1 then                // one thread per entry of column k after the first stored one
        tmpMem1[n*col+Li[Lp[k]+1+id]] = Lx[Lp[k]+1+id];   // L(i,k)*L(k,k)
        Lx[Lp[k]+1+id] = cuCdivf(Lx[Lp[k]+1+id], d);      // use this line when A is a complex matrix
        Lx[Lp[k]+1+id] /= d;                              // use this line instead when A is a real matrix
        tmpMem[n*col+Li[Lp[k]+1+id]] = Lx[Lp[k]+1+id];    // L(i,k)
    end if
end function
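
To check the output of a kernel written from Pseudocode 3, a serial host-side reference for the real-valued branch could look like the sketch below. It is not the authors' code. The function name factorize_column_reference, the use of std::vector, and double precision are assumptions made for illustration, as is the assumption that the first stored entry of column k is the diagonal. The loop index id stands in for the global thread index.

```cpp
#include <cassert>
#include <vector>

// Serial reference for the real-valued branch of Pseudocode 3 (illustrative only).
// Column k of L is held in compressed-column form (Lp, Li, Lx). The first stored
// entry of the column is assumed to be the diagonal and is skipped.
void factorize_column_reference(const std::vector<int>& Lp,
                                const std::vector<int>& Li,
                                std::vector<double>& Lx,
                                std::vector<double>& tmpMem,
                                std::vector<double>& tmpMem1,
                                double d, int n, int k, int col)
{
    assert(d != 0.0);                          // the pivot must be nonzero
    const int count = Lp[k + 1] - Lp[k] - 1;   // entries after the first stored one
    for (int id = 0; id < count; ++id) {       // id plays the role of the thread index
        const int p = Lp[k] + 1 + id;          // position of the entry in Li and Lx
        const int slot = n * col + Li[p];      // dense slot: row Li[p] of buffer column col
        tmpMem1[slot] = Lx[p];                 // value before scaling
        Lx[p] /= d;                            // scale by the pivot d
        tmpMem[slot] = Lx[p];                  // scaled value L(i,k)
    }
}
```

Every iteration writes to a distinct position p and, provided the row indices within a column are unique, to a distinct slot. This is why the pseudocode can assign one thread per entry without synchronization inside the column.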