Efficient Transformers for Charged Particle Tracking at the HL-LHC
Z. Wolffs*, V. Pshenov and S. Caron*
*Corresponding author
Abstract
The High-Luminosity LHC (HL-LHC) will operate with unprecedented detector occupancies, posing a significant computational challenge for the tracking of charged particles. Transformer architectures may provide a solution to this challenge due to their ability to learn global relationships and their excellent parallel scaling on GPU hardware. However, the quadratic cost of the self-attention mechanism prevents straightforward application to events containing the hit multiplicities expected at the HL-LHC. This article presents an efficient transformer model for track pattern recognition that overcomes this limitation by introducing a geometry-driven sparse attention scheme based on FlexAttention. Locality is imposed by clustering hits on projected detector surfaces, reducing the number of computed attention interactions by more than two orders of magnitude. Evaluated on the pixel-detector portion of the TrackML dataset, the model reaches a double-majority efficiency of 90\% for tracks with $p_{\mathrm{T}}>0.9$~GeV, while achieving per-event inference latencies of under 100~ms on an NVIDIA H100 GPU. These results demonstrate that transformer architectures equipped with sparse locality-aware attention are a promising direction for fast, scalable charged particle tracking at the HL-LHC.
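The core idea summarized above, restricting attention to hits that fall in the same geometric cluster, can be illustrated with a minimal sketch. This is not the authors' implementation (which uses FlexAttention in PyTorch); the function names, the toy cluster assignments, and the plain masked-softmax formulation are illustrative assumptions only.

```python
import numpy as np

def cluster_mask(cluster_ids):
    # Boolean (N, N) mask: hit i may attend to hit j only if both
    # were assigned to the same geometric cluster. (Illustrative stand-in
    # for the paper's geometry-driven sparsity pattern.)
    c = np.asarray(cluster_ids)
    return c[:, None] == c[None, :]

def masked_attention(q, k, v, mask):
    # Standard scaled dot-product attention with disallowed pairs
    # set to -inf before the softmax, so they receive zero weight.
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

# Toy example: 5 hits in 3 clusters. Only 9 of the 25 possible
# attention pairs are computed; a real event with thousands of hits
# is where the quadratic-to-sparse saving becomes large.
mask = cluster_mask([0, 0, 1, 1, 2])
rng = np.random.default_rng(0)
h = rng.normal(size=(5, 4))
out = masked_attention(h, h, h, mask)
```

In this toy setting a hit that is alone in its cluster (here the last one) attends only to itself, so its output equals its own value vector; in practice the cluster granularity trades tracking efficiency against the attention cost.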