Reducing event and data sizes is critical for experiments at the LHC, where high collision rates
and increased detector granularity rapidly increase storage and processing requirements. In the
CMS experiment, a recent development to address this challenge is the Raw’ format: a new
approach for recording silicon strip data in which only the reconstructed cluster’s barycenter and
average charge are stored, rather than the analog-to-digital converter counts from every strip. This
format was successfully deployed online during Run-3 for PbPb collisions at CMS, achieving an
event size reduction by nearly a factor of two and enabling CMS to record almost all hadronic
minimum bias PbPb collisions. To further enhance Raw’, we optimized the number of bits
used to encode the cluster barycenter and total charge, using tracking efficiency and resolution
as benchmarks. Comparing standard Raw with Raw’ shows that refining the bit precision yields
stronger compression while maintaining similar performance. Additionally, we introduce a lossless
compression strategy that encodes distances between clusters instead of their absolute positions
within a detector module. Unlike absolute positions, the distribution of these distances is peaked
around zero, effectively reducing the entropy of that variable. Consequently, LZMA compression
becomes more efficient, allowing even stronger data reduction than the current Raw’ algorithms
without losing information integrity. Lastly, we discuss projected data sizes for Phase-2 and explore
extending these techniques to other CMS detectors, notably the High-Granularity Calorimeter,
which is anticipated to generate a substantial fraction of future data.
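
The distance-encoding idea can be illustrated with a minimal sketch. This is not the CMS implementation: the toy module model (768 strips, 60 uniformly placed clusters per module), the 16-bit packing, and all function names below are assumptions made only for illustration. The sketch replaces sorted cluster positions within a module by the gap to the previous cluster, checks that the transformation is reversible, and compares the empirical entropy and the LZMA-compressed size of both representations.

```python
import lzma
import math
import random
import struct
from collections import Counter


def delta_encode(positions):
    """Replace sorted absolute positions by the gap to the previous position."""
    deltas, previous = [], 0
    for p in positions:
        deltas.append(p - previous)
        previous = p
    return deltas


def delta_decode(deltas):
    """Invert delta_encode with a running sum (lossless by construction)."""
    positions, running = [], 0
    for d in deltas:
        running += d
        positions.append(running)
    return positions


def shannon_entropy_bits(values):
    """Empirical Shannon entropy of a sequence, in bits per value."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def pack_u16(values):
    """Serialize values as little-endian unsigned 16-bit integers."""
    return struct.pack(f"<{len(values)}H", *values)


# Hypothetical toy data: 300 modules, each with 60 distinct sorted strip positions.
rng = random.Random(12345)
modules = [sorted(rng.sample(range(768), 60)) for _ in range(300)]

absolute = [p for m in modules for p in m]
deltas = [d for m in modules for d in delta_encode(m)]

entropy_abs = shannon_entropy_bits(absolute)
entropy_delta = shannon_entropy_bits(deltas)
size_abs = len(lzma.compress(pack_u16(absolute)))
size_delta = len(lzma.compress(pack_u16(deltas)))

print(f"entropy: absolute {entropy_abs:.2f} bits, deltas {entropy_delta:.2f} bits")
print(f"LZMA size: absolute {size_abs} bytes, deltas {size_delta} bytes")
```

In this toy setup the gaps concentrate at small values while the absolute positions spread over the full strip range, so the delta stream has the lower entropy. Real cluster distributions will differ, so the numbers printed here say nothing about the reduction achieved on CMS data.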

