PoS - Proceedings of Science
Volume 430 - The 39th International Symposium on Lattice Field Theory (LATTICE2022) - Software development and Machines
Maximizing the Bang Per Bit
K. Clark*, D. Howarth, J. Tu, M. Wagner and E. Weinberg
Pre-published on: January 29, 2023
Published on: April 06, 2023
Reducing memory traffic is critical to accelerating Lattice QCD computations on modern processors, given that such computations are memory-bandwidth bound. A commonly used strategy is to employ mixed-precision solvers; however, these require careful treatment to ensure stable convergence. We give an overview of the strategies employed in QUDA to stabilize mixed-precision variants of Conjugate Gradient (CG) and its multi-shift brethren. Through the use of customized numerical storage formats, we can significantly improve upon the precision achievable with IEEE numerical formats, increasing both the solver precision and the stability achievable at a fixed word size. We give examples using BiCGStab(l) and multi-shift CG solvers with the HISQ operator.
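To make the storage-format idea concrete, the following minimal sketch compares two ways of holding a block of field components in 16-bit words: IEEE 754 half precision, and a signed 16-bit fixed-point payload that shares one full-precision scale factor per block. The fixed-point layout, the block size of 24 and the function names are assumptions chosen for illustration; the abstract does not specify the formats used in QUDA, so this should not be read as the library's implementation.

```python
import math
import random
import struct


def roundtrip_fp16(values):
    """Store each value as an IEEE 754 binary16 number and read it back."""
    return [struct.unpack("<e", struct.pack("<e", v))[0] for v in values]


def roundtrip_fixed16(values):
    """Store values as signed 16-bit integers that share one scale factor.

    Hypothetical format: the largest magnitude in the block is kept
    separately at full precision and maps to the top of the int16 range.
    """
    scale = max(abs(v) for v in values)
    if scale == 0.0:
        return list(values)
    max_int = 32767
    stored = [round(v / scale * max_int) for v in values]  # 16-bit payload
    return [q * scale / max_int for q in stored]


def rel_l2_error(exact, approx):
    """Relative L2 distance between the exact and the stored block."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(exact, approx)))
    den = math.sqrt(sum(a * a for a in exact))
    return num / den


# A block of components with comparable magnitudes (assumed block size: 24).
rng = random.Random(0)
block = [rng.uniform(-1.0, 1.0) for _ in range(24)]

err_fp16 = rel_l2_error(block, roundtrip_fp16(block))
err_fixed = rel_l2_error(block, roundtrip_fixed16(block))
print(f"IEEE half precision  : {err_fp16:.3e}")
print(f"fixed point + scale  : {err_fixed:.3e}")
```

Half precision spends 5 of its 16 bits on an exponent, which is largely wasted when the components of a block have similar magnitudes; the fixed-point payload uses those bits for resolution instead. The trade-off in this sketch is the extra scale factor stored per block and the loss of accuracy for components much smaller than the block maximum.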
DOI: https://doi.org/10.22323/1.430.0338

Open Access
Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.