Developed a CUDA version of the FDTD method and achieved a 40x speedup. Implemented on an NVIDIA Quadro FX 3800 GPU, which has 192 SPs, 1 GB of global memory, and a memory bandwidth of 51.2 GB/s.
SIAM Journal on Numerical Analysis, Vol. 26, No. 6 (Dec., 1989), pp. 1474-1486 (13 pages). An explicit finite difference algorithm is developed to approximate the solution of a nonlinear and nonlocal ...
Finite-difference approximations for the first derivative, valid halfway between equidistant gridpoints, are in general much more accurate than the corresponding approximations, which are valid at ...
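The accuracy claim about halfway-point approximations can be checked numerically in the simplest case. The sketch below is only an illustration under stated assumptions: it compares the two second-order stencils (one valid at a gridpoint, one valid halfway between two gridpoints) on the test function sin(x) with grid spacing h. The function names, the test function, and the values of x and h are hypothetical choices for this example and do not come from the cited work.

```python
import math


def centered_at_gridpoint(f, x, h):
    """Second-order first-derivative estimate valid AT a gridpoint x.

    Uses the neighbouring gridpoints x - h and x + h (spacing h).
    """
    return (f(x + h) - f(x - h)) / (2.0 * h)


def centered_at_midpoint(f, x, h):
    """Second-order first-derivative estimate valid HALFWAY between gridpoints.

    Here x is the midpoint, so the adjacent gridpoints are x - h/2 and x + h/2
    (same spacing h as above).
    """
    return (f(x + 0.5 * h) - f(x - 0.5 * h)) / h


def compare_errors(x=1.0, h=0.1):
    """Return (error at gridpoint, error at midpoint) for f = sin, f' = cos."""
    exact = math.cos(x)
    err_grid = abs(centered_at_gridpoint(math.sin, x, h) - exact)
    err_mid = abs(centered_at_midpoint(math.sin, x, h) - exact)
    return err_grid, err_mid


if __name__ == "__main__":
    err_grid, err_mid = compare_errors()
    print("error at gridpoint:", err_grid)
    print("error at midpoint: ", err_mid)
    print("ratio:", err_grid / err_mid)
```

For these second-order stencils the leading truncation terms are h²f‴/6 at a gridpoint and h²f‴/24 at a midpoint, so the error ratio should come out close to 4 for small h. This covers only the lowest-order case; higher-order stencils are not examined here.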