Implementation and Optimization of Cosmological N-Body Simulation with the FMM-PM Method on GPUs
Yueyue Fu, Wu Wang, Qiao Wang

Fig. 10 Comparison of P2P function execution time between the pure MPI program and the three CUDA implementations