An Effective Implementation of LDLT Decomposition of Sparse Symmetric Matrix on GPU

Chen Xinfeng, Wang Wu

Table 2 Runtime (in ms) of the decomposition and solve phases of the UMFPACK solver, and the speedup of our solver compared with UMFPACK

| matrix | symbolic (ms) | numeric (ms) | solve (ms) | T(umf) (ms) | Sp(ldlt) | Sp(total) |
|---|---|---|---|---|---|---|
| windscreen | 603.154 | 13968.758 | 94.106 | 14666.018 | 14.342 | 8.056 |
| crystk03 | 215.40 | 4951.943 | 34.575 | 5201.919 | 7.143 | 4.513 |
| bcsstk37 | 54.54 | 1447.959 | 30.678 | 1533.177 | 4.224 | 2.763 |
| bcsstk35 | 223.709 | 1661.172 | 31.031 | 1915.912 | 4.692 | 3.399 |
| t3dh | 7468.603 | 191216.956 | 443.984 | 199129.543 | 46.227 | 25.245 |
| TEM152078 | 9288.08 | 200071.001 | 566.479 | 209925.56 | 40.884 | 21.262 |
| TEM181302 | 11430.526 | 218608.984 | 288.069 | 230327.579 | 35.884 | 18.670 |
| pwtk | 1171.662 | 21133.883 | 211.081 | 22516.626 | 7.343 | 3.805 |
| BenElechi1 | 1129.767 | 24923.273 | 211.145 | 26264.185 | 7.805 | 4.309 |
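
The arithmetic behind the table can be checked with a short sketch. In every row, T(umf) matches the sum of the symbolic, numeric, and solve times to within rounding. The sketch below verifies this and, under the assumption (not stated in the table itself) that Sp(total) is defined as T(umf) divided by the total runtime of the paper's solver, backs out that implied runtime. The function names `phase_sum` and `implied_time` are hypothetical helpers introduced only for illustration.

```python
# Rows copied from Table 2; all times are in milliseconds.
# Columns: matrix, symbolic, numeric, solve, T(umf), Sp(total)
ROWS = [
    ("windscreen", 603.154, 13968.758, 94.106, 14666.018, 8.056),
    ("crystk03", 215.40, 4951.943, 34.575, 5201.919, 4.513),
    ("bcsstk37", 54.54, 1447.959, 30.678, 1533.177, 2.763),
    ("bcsstk35", 223.709, 1661.172, 31.031, 1915.912, 3.399),
    ("t3dh", 7468.603, 191216.956, 443.984, 199129.543, 25.245),
    ("TEM152078", 9288.08, 200071.001, 566.479, 209925.56, 21.262),
    ("TEM181302", 11430.526, 218608.984, 288.069, 230327.579, 18.670),
    ("pwtk", 1171.662, 21133.883, 211.081, 22516.626, 3.805),
    ("BenElechi1", 1129.767, 24923.273, 211.145, 26264.185, 4.309),
]


def phase_sum(symbolic, numeric, solve):
    """Total UMFPACK time as the sum of its three phases (ms)."""
    return symbolic + numeric + solve


def implied_time(t_umf, speedup):
    """Runtime implied by a speedup, ASSUMING speedup = T(umf) / T(ours)."""
    return t_umf / speedup


for name, symbolic, numeric, solve, t_umf, sp_total in ROWS:
    total = phase_sum(symbolic, numeric, solve)
    # The reported T(umf) agrees with the phase sum up to rounding.
    assert abs(total - t_umf) < 0.01, name
    print(f"{name}: T(umf) = {t_umf:.3f} ms, "
          f"implied solver time = {implied_time(t_umf, sp_total):.1f} ms")
```

The implied runtimes are only as reliable as the assumed definition of Sp(total); the same approach cannot be applied to Sp(ldlt) without knowing which UMFPACK phases it is measured against.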