The Implementation and Optimization of LICOM on GPUs
Zhang Liuying, Wang Pengfei, Zhang Feng, Liu Hailong, Lin Pengfei, Wang Tao, Wei Junlin, Tian Shaobo, Jiang Jinrong, Chi Xuebin
Table 2 CUDA version of the vinteg function (program fragment)
Example: vinteg.cu
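// Global indices of the horizontal grid point handled by this thread.
i = blockIdx.x * blockDim.x + threadIdx.x;
j = blockIdx.y * blockDim.y + threadIdx.y;
// Skip threads that fall outside the d_imt x d_jmt domain.
if (i < d_imt && j < d_jmt) {
    // Sum over the d_km levels; 3D arrays are stored with i fastest, then j, then k.
    for (k = 0; k < d_km; k++) {
        d_wk2[j * d_imt + i] += d_dzp[k] *
                                d_ohbu[j * d_imt + i] *
                                d_wk3[k * d_jmt * d_imt + j * d_imt + i] *
                                d_viv[k * d_jmt * d_imt + j * d_imt + i];
    }
}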
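The fragment in Table 2 omits the kernel signature, the declarations and the host-side launch. The following is a minimal, self-contained sketch of how such a kernel might be wrapped and checked against a CPU loop; it is not the authors' code. The names vinteg_sketch and vinteg_max_error, the double-precision type, the 16 x 16 block size, the synthetic input data and passing the dimensions as kernel arguments (instead of the d_imt, d_jmt and d_km symbols used in the paper) are all assumptions made for illustration. The sketch also accumulates into a register and writes d_wk2 once, which can differ from the original per-level update in the last rounding bits.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: one thread per horizontal point (i, j), looping over k.
__global__ void vinteg_sketch(int imt, int jmt, int km,
                              const double *dzp, const double *ohbu,
                              const double *wk3, const double *viv,
                              double *wk2)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < imt && j < jmt) {
        int ij = j * imt + i;
        double sum = 0.0;                    // register accumulator
        for (int k = 0; k < km; k++) {
            int ijk = k * jmt * imt + ij;    // i fastest, then j, then k
            sum += dzp[k] * ohbu[ij] * wk3[ijk] * viv[ijk];
        }
        wk2[ij] += sum;                      // single global-memory update
    }
}

// Runs the kernel on synthetic data and returns the largest absolute
// difference from a plain CPU loop, or -1.0 if any CUDA call fails.
double vinteg_max_error(int imt, int jmt, int km)
{
    size_t n2 = (size_t)imt * jmt, n3 = n2 * km;
    std::vector<double> dzp(km), ohbu(n2), wk3(n3), viv(n3);
    std::vector<double> wk2(n2, 0.0), ref(n2, 0.0);
    for (int k = 0; k < km; k++) dzp[k] = 10.0 + k;
    for (size_t p = 0; p < n2; p++) ohbu[p] = 1.0 / (100.0 + p);
    for (size_t p = 0; p < n3; p++) {
        wk3[p] = 0.001 * (double)(p % 17);
        viv[p] = (p % 5 == 0) ? 0.0 : 1.0;   // synthetic 0/1 factor
    }
    // CPU reference using the same index arithmetic.
    for (int k = 0; k < km; k++)
        for (size_t p = 0; p < n2; p++)
            ref[p] += dzp[k] * ohbu[p] * wk3[k * n2 + p] * viv[k * n2 + p];

    double *d_dzp = 0, *d_ohbu = 0, *d_wk3 = 0, *d_viv = 0, *d_wk2 = 0;
    bool ok = true;
    ok = ok && cudaMalloc((void **)&d_dzp, km * sizeof(double)) == cudaSuccess;
    ok = ok && cudaMalloc((void **)&d_ohbu, n2 * sizeof(double)) == cudaSuccess;
    ok = ok && cudaMalloc((void **)&d_wk3, n3 * sizeof(double)) == cudaSuccess;
    ok = ok && cudaMalloc((void **)&d_viv, n3 * sizeof(double)) == cudaSuccess;
    ok = ok && cudaMalloc((void **)&d_wk2, n2 * sizeof(double)) == cudaSuccess;
    ok = ok && cudaMemcpy(d_dzp, dzp.data(), km * sizeof(double), cudaMemcpyHostToDevice) == cudaSuccess;
    ok = ok && cudaMemcpy(d_ohbu, ohbu.data(), n2 * sizeof(double), cudaMemcpyHostToDevice) == cudaSuccess;
    ok = ok && cudaMemcpy(d_wk3, wk3.data(), n3 * sizeof(double), cudaMemcpyHostToDevice) == cudaSuccess;
    ok = ok && cudaMemcpy(d_viv, viv.data(), n3 * sizeof(double), cudaMemcpyHostToDevice) == cudaSuccess;
    ok = ok && cudaMemcpy(d_wk2, wk2.data(), n2 * sizeof(double), cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        dim3 block(16, 16);                  // assumed block shape
        dim3 grid((imt + block.x - 1) / block.x, (jmt + block.y - 1) / block.y);
        vinteg_sketch<<<grid, block>>>(imt, jmt, km, d_dzp, d_ohbu, d_wk3, d_viv, d_wk2);
        ok = cudaGetLastError() == cudaSuccess;
        // cudaMemcpy blocks until the kernel has finished.
        ok = ok && cudaMemcpy(wk2.data(), d_wk2, n2 * sizeof(double), cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_dzp); cudaFree(d_ohbu); cudaFree(d_wk3); cudaFree(d_viv); cudaFree(d_wk2);
    if (!ok) return -1.0;

    double err = 0.0;
    for (size_t p = 0; p < n2; p++) err = fmax(err, fabs(wk2[p] - ref[p]));
    return err;
}

int main()
{
    // Sizes deliberately not multiples of 16, to exercise the bounds check.
    double err = vinteg_max_error(37, 21, 5);
    printf("max abs error vs CPU: %g\n", err);
    return (err >= 0.0 && err < 1e-9) ? 0 : 1;
}
```

With i as the fastest-varying index and mapped to threadIdx.x, neighbouring threads read neighbouring elements of the 3D arrays at each level, which is the access pattern that lets global-memory loads coalesce.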