Development of a Scout Job Framework for Improving Efficiency of Analysis Jobs in Belle II Distributed Computing System
September 28, 2022
The Belle II experiment collects a massive amount of collision data to broadly advance our understanding of particle physics. A worldwide distributed computing system is used to store and process the data, and the same system is used for data analysis. Failed jobs are one of the factors that prevent efficient physics analysis in this system, so it is important to suppress them. Because these failures originate from problems in the analysis script or improper settings of the job parameters, we developed a syntax checker and a scout job framework. The former detects language-level syntax errors in analysis scripts and notifies the end-user before job submission. The latter submits a small number of scout jobs that run the same analysis script over a small number of events before the massive job submission, in order to detect more complicated syntax errors or improper settings of job parameters. The main jobs are submitted only if the scout jobs succeed. With these two features, we can reduce system troubles and the waste of computational resources caused by failed jobs. End-users also benefit because the pre-test is streamlined.
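The two-stage check described in the abstract can be illustrated with a minimal sketch. It assumes the analysis script is Python source and uses the standard-library `ast` module for the language-level check. The names `check_syntax`, `submit_with_scouts`, `run_job`, `n_scouts`, and `scout_events` are hypothetical and are not taken from the Belle II software. `run_job` stands in for whatever actually submits a job and reports whether it succeeded.

```python
import ast
from typing import Callable, Optional, Sequence


def check_syntax(source: str, filename: str = "<analysis script>") -> Optional[str]:
    """Return None if the script parses as Python, else a short error message."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None


def submit_with_scouts(
    source: str,
    inputs: Sequence[str],
    run_job: Callable[[str, str, Optional[int]], bool],  # hypothetical job runner
    n_scouts: int = 2,       # assumed number of scout jobs
    scout_events: int = 10,  # assumed number of events per scout job
) -> str:
    """Gate the main submission on a syntax check and a few small scout jobs."""
    # Stage 1: language-level check, done before anything is submitted.
    error = check_syntax(source)
    if error is not None:
        return f"rejected before submission: {error}"

    # Stage 2: scout jobs run the same script over a small number of events.
    for input_file in inputs[:n_scouts]:
        if not run_job(source, input_file, scout_events):
            return f"scout failed on {input_file}; main jobs not submitted"

    # Stage 3: all scouts succeeded, so submit the main jobs over all events.
    for input_file in inputs:
        run_job(source, input_file, None)
    return f"submitted {len(inputs)} main jobs"
```

In this sketch a failing scout costs only a few events of computing time, whereas the same error in the main submission would fail once per input file.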