Long-term Storage Achieves of IHEP: From CASTOR to EOSCTA
Q. Yao*, Y. Bi and Y. Cheng
Published on: September 28, 2022
CASTOR, the primary tape storage system of CERN, has been used at IHEP for more than fifteen years. By 2022, the data volume from the various experiments had reached 12 PB. More experiments at IHEP, such as JUNO, CEPC, and HEPS, require long-term storage. To handle the rapid increase in data, we plan to replace the tape storage system, moving from CASTOR to EOSCTA. Since 2021, new LHAASO data has gradually been saved in EOSCTA. In 2022, BES online data and JUNO raw data will be saved directly in EOSCTA.

In this paper, we describe the current EOSCTA infrastructure at IHEP. We set up two EOS instances, which serve four experiments and support multiple online file systems (Lustre and EOS). Because data are generated in different ways, we designed different workflows to transfer data from remote experimental stations or local disk arrays to EOSCTA.

CASTOR will be replaced by EOSCTA, and all existing CASTOR data will be migrated to EOSCTA. We will also upgrade the tape generation from LTO 4 to LTO 7. The migration covers five CASTOR instances and two types of tape library. Most of the migration is planned to be completed by 2023. The paper reports the migration plan, the steps and methods of data migration, and the post-migration inspection used to ensure data integrity.

Based on our experience with EOSCTA so far, we present an outlook on the requirements of experiments at IHEP and discuss possible ways to use EOSCTA for mass data storage from different data sources.
DOI: https://doi.org/10.22323/1.415.0007
