Matrix-matrix multiplications are common in quantum chemistry calculations and can benefit enormously from GPU acceleration. Although NVIDIA provides an implementation of the BLAS *GEMM routines with its CUDA distribution, two key problems arise when trying to use these from existing code:
- Most GPUs in current use have limited memory available
- Few GPUs have double precision hardware available
Although these problems will not usually be encountered on research clusters, code running on distributed clients (such as BOINC) cannot assume that a large-memory, double-precision GPU will be available. The SciGPU-GEMM library was written to alleviate these difficulties. The library contains three principal routines:
- dgemm_cleaver: A DGEMM implementation which will split the input matrices into pieces small enough to fit onto the GPU
- sgemm_cleaver: The same, but for SGEMM
- mgemm: A multi-precision matrix-matrix multiplication routine
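
The splitting idea behind the two cleaver routines can be illustrated with a small sketch. This is not the library's code or API: `gemm_cleaver`, `matmul`, `block` and the `tile` parameter are hypothetical names chosen for illustration, and plain Python lists stand in for device memory. The assumption is that each tile-sized product is what would be sent to the GPU, so that only a few small blocks need to be resident at any one time.

```python
def matmul(a, b):
    """Plain triple-loop product; stands in for one GEMM call on the GPU."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def block(mat, r0, r1, c0, c1):
    """Copy the sub-matrix mat[r0:r1, c0:c1]."""
    return [row[c0:c1] for row in mat[r0:r1]]


def gemm_cleaver(a, b, tile):
    """Compute C = A * B one tile at a time.

    Each inner step multiplies a block of A by a block of B, both at most
    tile x tile, and accumulates the result into the matching block of C.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        i1 = min(i0 + tile, n)
        for j0 in range(0, m, tile):
            j1 = min(j0 + tile, m)
            for p0 in range(0, k, tile):
                p1 = min(p0 + tile, k)
                # In a real implementation this product would run on the GPU.
                part = matmul(block(a, i0, i1, p0, p1),
                              block(b, p0, p1, j0, j1))
                for i, row in enumerate(part):
                    for j, value in enumerate(row):
                        c[i0 + i][j0 + j] += value
    return c
```

Choosing `tile` so that three blocks fit in the available GPU memory is the design decision that matters here; the sketch leaves that choice to the caller.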
The last of these routines, mgemm, splits the matrices into 'small' and 'large' portions. The 'small' portions are handled in single precision on the GPU, while the CPU handles the 'large' portions in double precision.

Attached to this story is a tarball of the v0.8 release of the SciGPU-GEMM library.
| Attachment | Size |
|---|---|
| scigpugemm0.8.tgz | 70.48 KB |
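
The precision split performed by mgemm, described above, can be sketched as follows. This is an illustration under stated assumptions rather than the library's implementation: it assumes the split is made element by element against a magnitude `cutoff`, and that every product involving a 'large' element is sent to the double-precision path. The names `mgemm_sketch`, `split`, `to_f32` and `cutoff` are hypothetical, and single precision is emulated on the CPU by rounding through a 4-byte float.

```python
import struct


def to_f32(x):
    """Round a double to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def split(mat, cutoff):
    """Separate a matrix into 'small' and 'large' parts, element by element."""
    small = [[v if abs(v) <= cutoff else 0.0 for v in row] for row in mat]
    large = [[v if abs(v) > cutoff else 0.0 for v in row] for row in mat]
    return small, large


def matmul(a, b, rnd=lambda x: x):
    """Matrix product; rnd is applied after every operation to emulate precision."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc = rnd(acc + rnd(rnd(a[i][p]) * rnd(b[p][j])))
            c[i][j] = acc
    return c


def add(x, y):
    """Element-wise sum of two matrices of equal shape."""
    return [[u + v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def mgemm_sketch(a, b, cutoff):
    """C = A*B with small-by-small products in single precision (assumed split)."""
    a_small, a_large = split(a, cutoff)
    b_small, b_large = split(b, cutoff)
    # 'Small' portion: single precision, standing in for the GPU.
    gpu_part = matmul(a_small, b_small, to_f32)
    # 'Large' portion: double precision, standing in for the CPU.
    # A*B = A_small*B_small + A_large*B + A_small*B_large
    cpu_part = add(matmul(a_large, b), matmul(a_small, b_large))
    return add(gpu_part, cpu_part)
```

Because only elements below the cutoff pass through single precision, the rounding error they introduce stays small in absolute terms, while the large elements that dominate the result keep full double precision.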
