Trace estimation is central in many lattice QCD computations, but the accuracy of the standard stochastic Hutchinson method improves only with the square root of the sample size, making precise results expensive.
We investigate two complementary variance reduction strategies. First, \emph{multigrid multilevel Monte Carlo} uses a multigrid hierarchy to construct an unbiased multilevel estimator via recursive coarse-grid corrections available from the multigrid solver. Second, \emph{stochastic probing} uses distance-$d$ graph colorings; we propose a torus-based coloring that requires substantially fewer colors than hierarchical probing at the same distance.
We test these approaches on two representative problems: the connected pseudoscalar correlator and disconnected fermion loops. For the connected pseudoscalar two-point function, the multilevel decomposition yields a variance reduction of up to $\mathcal{O}(10^5)$ at large time separations and translates into a clear cost reduction at fixed accuracy, thus confirming earlier results~[1]. For the disconnected loops, in contrast, the multilevel decomposition provides only moderate gains, whereas probing combined with dilution delivers a substantial cost reduction that improves as the number of probing vectors is increased. Overall, the results highlight a pronounced complementarity: deflation schemes are most effective for observables dominated by long-distance propagation, while probing is most effective for localized quantities.
[1] R.~Gruber, T.~Harris, and M.~K.~Marinkovi{\'c}, ``Multigrid Low-Mode Averaging,'' \emph{Phys.\ Rev.\ D} \textbf{111}, 074508 (2025). doi:10.1103/PhysRevD.111.074508.

