

# Hardware production quality control for the ATLAS Phase-I readout upgrade

### Fabrizio Alfonsi\* on behalf of the ATLAS TDAQ Collaboration

University of Bologna and INFN

E-mail: fabrizio.alfonsi@cern.ch

The off-detector data acquisition system upgrade of the LHC ATLAS experiment at CERN is based on the Front-End LInk eXchange (FELIX) framework. As part of this upgrade, approximately 120 custom PCIe cards are being produced by an industrial partner, based on a hardware design developed within the ATLAS collaboration. This production requires detailed Quality Assurance and Quality Control procedures to ensure that the hardware is fully functional and remains robust for many years.

Topical Workshop on Electronics for Particle Physics (TWEPP2019), 2-6 September 2019, Santiago de Compostela, Spain

\*Speaker.

## 1. Introduction

The Large Hadron Collider (LHC) has been successfully delivering proton-proton collision data at the unprecedented center-of-mass energy of 13 TeV, with an instantaneous luminosity already above the design value. The road map ahead foresees a further increase in luminosity and important upgrades to both the LHC and the experiments. ATLAS [1] is upgrading the detector, the trigger and the data acquisition system for data taking in Run 3 (after the Phase-I upgrade [2], starting in 2021) and Run 4 (after the Phase-II upgrade [3], starting in 2026).

One of the most relevant upgrades in ATLAS Phase-I will be the installation of the New Small Wheel (NSW) subdetector, as part of the Muon Spectrometer, together with a new set of off-detector FLX-712 DAQ (Data AcQuisition) cards developed by the FELIX collaboration [4]. These cards are currently under production and commissioning and will be used to read out the NSW.

## 2. FELIX Architecture

The FELIX (Front-End LInk eXchange) [5] architecture is based on the idea of reading out the detector with a dedicated common readout system, integrated in commercial off-the-shelf servers. Figure 1 shows a scheme of the DAQ infrastructure, which will be used to read out several upgraded ATLAS subsystems after the Long Shutdown 2. This architecture has been developed to sustain triggering and data recording at up to three times the nominal LHC instantaneous luminosity. The scheme shows the FELIX cards handling the busy signal from the detector and the Timing, Trigger and Control (TTC) signals; the high-throughput inputs/outputs manage:

- the communication with the front-end (FE) chips; the FELIX cards handle this using either the GBT protocol (that is, communicating with a GBTx chip, as in the example shown in Figure 1) or the FullMode protocol;
- the 40 Gb/s data transfer to a switched Ethernet or InfiniBand network (handled by the DAQ PC), which enables PC-based handling of the FE configuration, calibration and many other operations.

The FELIX card and the DAQ PC communicate over PCI Express Gen3. Figure 2 shows the FELIX FLX-712 board, developed for use in ATLAS from Run 3 onwards. The data acquisition card provides 48 links, tested at 9.6 Gb/s, using the high-speed serializer-deserializers available on the on-board FPGA. A 24-link version of the card exists as well. The FLX-712 uses the two data protocols cited above: GBT (bandwidth: 4.8 Gb/s) and the FELIX custom FullMode (bandwidth: 9.6 Gb/s). An ADN2814 chip recovers the clock from the data, and a Si5345 jitter cleaner handles the TTC input and provides the high-quality clock for the high-speed transmission in the FPGA.
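As a rough cross-check of the figures quoted above, the following minimal sketch computes the aggregate raw line rate of a 48-link card for the two protocols and sets it next to the raw PCIe Gen3 x16 bandwidth. The PCIe numbers (8 GT/s per lane, 128b/130b encoding) are taken from the PCIe specification rather than from this paper, all names are illustrative, and the results are raw line rates, not payload throughput.

```python
# Raw line-rate arithmetic for a 48-link FLX-712 (link figures from the text).
LINKS = 48              # optical links on the full-size card
GBT_GBPS = 4.8          # GBT line rate per link
FULLMODE_GBPS = 9.6     # FullMode line rate per link

# PCIe Gen3 constants: from the PCIe specification, not from this paper.
PCIE_GEN3_GT_PER_LANE = 8.0       # giga-transfers per second per lane
PCIE_GEN3_ENCODING = 128 / 130    # 128b/130b line-coding efficiency
PCIE_LANES = 16


def aggregate_gbps(links: int, rate_gbps: float) -> float:
    """Sum of the raw line rates over all links, in Gb/s."""
    return links * rate_gbps


def pcie_gen3_gbps(lanes: int) -> float:
    """Raw PCIe Gen3 bandwidth after line coding, in Gb/s (one direction)."""
    return lanes * PCIE_GEN3_GT_PER_LANE * PCIE_GEN3_ENCODING


gbt_total = aggregate_gbps(LINKS, GBT_GBPS)
fullmode_total = aggregate_gbps(LINKS, FULLMODE_GBPS)
pcie_total = pcie_gen3_gbps(PCIE_LANES)

print(f"GBT,      {LINKS} links: {gbt_total:.1f} Gb/s")
print(f"FullMode, {LINKS} links: {fullmode_total:.1f} Gb/s")
print(f"PCIe Gen3 x{PCIE_LANES}:       {pcie_total:.1f} Gb/s")
```

These are upper bounds on the serial line rates only; protocol overheads and the actual number of links in use determine the throughput that has to cross the PCIe bus.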

## 3. Test Plan

To ensure the quality of the FLX-712 cards after production, a list of checks and tests was prepared. This test suite includes standard industrial tests and specific cross-checks prepared by the FELIX collaboration. The plan was developed to match the high quality standard required for manufacturing the PCB and the high connectivity and performance of the card components. In addition, the Quality Assurance and Quality Control (QA/QC) procedure verifies the firmware stability and the communication between the Linux software tools and the cards through PCI Express.

**Figure 1:** FELIX infrastructure [5].

**Figure 2:** FLX-712 card. The board is based on a Xilinx XCKU115 FPGA [6, 7] (1) and is built on a 16-layer stackup PCB (7). The FLX-712 features a 16-lane PCI Express Gen3 interface (3), managed in an 8x2 structure by a PCIe switch (6), and an opto-electrical converter. The conversion is performed by eight devices called miniPODs (2), four in transmission and four in reception. The miniPODs gather 12 links each and are connected to the output MTP coupler (4). Finally, a TTC (Timing, Trigger and Control) card (5) handles the cleaning of the ATLAS global clock and asserts the Busy signal from the DAQ PC in case of back-pressure.

## 4. Functionality Test Setup

The motherboard used for the functionality tests, which are executed after the mechanical assembly, is a SUPERMICRO X10DRG-Q [8]. It features five PCI Express (PCIe) Gen3 x16 connections and two CPU sockets, populated with two Intel Xeon E5-2660 v3 processors at 2.60 GHz. This configuration allows four FLX-712 cards to be operated concurrently on the PCIe bus. The operating system is Scientific Linux CERN 6. The system was set up by the author with the support of the Bologna group. The scripts required to run all the tests were developed following the FELIX team specifications, using FPGA tools to stress the performance and FELIX software scripts to check the card functionalities.
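Before running the tests on such a setup, it is useful to confirm that all the installed cards are visible on the PCIe bus. The sketch below counts Xilinx endpoints in text formatted like the output of `lspci -n`. It is not part of the FELIX software: the function name, the sample listing and its device IDs are hypothetical, and the assumption that each card appears as two endpoints is inferred from the 8x2 lane structure of the board.

```python
XILINX_VENDOR_ID = "10ee"   # PCI vendor ID assigned to Xilinx
ENDPOINTS_PER_CARD = 2      # ASSUMPTION: one endpoint per x8 half of the card


def count_endpoints(lspci_n_output: str, vendor_id: str = XILINX_VENDOR_ID) -> int:
    """Count devices of a given vendor in `lspci -n`-style text."""
    count = 0
    for line in lspci_n_output.splitlines():
        fields = line.split()
        # Expected layout: <slot> <class>: <vendor>:<device> [(rev xx)]
        if len(fields) < 3 or ":" not in fields[2]:
            continue
        vendor, _device = fields[2].split(":", 1)
        if vendor.lower() == vendor_id:
            count += 1
    return count


# Hypothetical listing for illustration only; the device IDs are made up.
SAMPLE = """\
02:00.0 0604: 10b5:8732 (rev ca)
03:00.0 1180: 10ee:0001
04:00.0 1180: 10ee:0001
81:00.0 0200: 8086:1521 (rev 01)
"""

endpoints = count_endpoints(SAMPLE)
cards = endpoints // ENDPOINTS_PER_CARD
print(f"{endpoints} Xilinx endpoints found, i.e. {cards} card(s)")
```

On a real system the text would come from running `lspci -n` on the test PC, and the count would be compared with the number of cards installed (up to four in the setup described above).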

## 5. Test Description

Industrial specifications concerning the PCB quality standard requested by CERN, for example NADCAP (National Aerospace and Defense Contractors Accreditation Program), had to be fulfilled for the FLX-712 production. In addition, before the PCB was populated, the quality and the electrical I/O ports of all the components were fully checked and tested. Further checks were performed on the cards after the soldering operations to verify the connections and the robustness against mechanical stress. Particular attention was paid to the X-ray inspection of the largest BGA pads, for example those of the FPGA. After the cards had passed the industrial tests and were completely assembled, they were sent to CERN for the qualification and final acceptance tests. These tests included:

- a further visual inspection of the cards to confirm their integrity;
- Bit Error Ratio (BER) tests on the transceiver side of the FLX-712s. A BER < 10<sup>-13</sup> was obtained using a Pseudo Random Binary Sequence. Figure 3 shows a typical eye diagram that passed the acceptance criteria, which are based on the shape of the eye and on its open area (a well-open eye and an open area, as calculated by the Vivado tool, > 6300);
- general functional tests of all the monitoring operations, such as the initialization of the components, and of the minimum accepted performance, such as the GBT and FullMode throughput;
- jitter measurements on several clock nets of the board, with a maximum accepted value of 10 ps;
- the propagation time of L1A trigger signals from the FLX-712s to the emulated front-end, which was implemented in different devices: the accepted value had to be constant at around 500 ns;
- a long-term test of at least 8 hours to stress the stability of the overall data acquisition flow;
- a check of the busy signal from the FLX-712s (asserted when the system is busy and cannot acquire further data) and of the slow control to the emulated front-end;
- two thermal cycles from 0 to 100 °C on a small sample of unpowered cards, performed in a thermal chamber.



**Figure 3:** One of the many eye diagrams studied. The axes (Voltage and Unit Interval) represent the voltage and time units respectively (not in a linear correlation), while the BER color legend refers to the color map of the diagram, which shows the "distance" from the voltage offset calculated for ideal behavior.

Figure 4 is a picture of one of the setups used during these tests.



**Figure 4:** Test setup for fully assembled cards. This image shows the FLX-712 (2), the JTAG connection to communicate with the card (1), the motherboard (3) on the PC, the loopback optical cable for data transmission and reception (4) and the 4U PC case (5).

## 6. Summary and Plans for the Future

In December 2018 the Long Shutdown 2 started and the ATLAS Phase-I upgrade began. The FLX-712 electronic boards, developed within the FELIX collaboration, are under production and will be used in Run 3 for the readout of some upgraded subdetectors, such as the New Small Wheel. For this purpose a new batch of FLX-712 cards was produced and fully validated. All the tests were passed by 18 cards, proving the stability of the hardware design, which was the goal of these tests. The two defective cards, which failed one sub-part of the functional tests, are now undergoing further checks. Some minor modifications will be made to the board design for the next production batch to avoid possible critical conditions, for example those caused by the on-board optical fiber being too short. Some further firmware stability issues occurred during the long-term tests. All these issues have now been solved, demonstrating the efficiency of the firmware and of the hardware and software environment. 20 cards were produced as a pre-series, and the production of the remaining 100 cards is still ongoing. The validated cards have already been delivered to the ATLAS subdetectors for commissioning.

## References

- [1] ATLAS Collaboration, JINST 3 (2008), S08003.
- [2] ATLAS Collaboration, ATLAS TDAQ System Phase-I Upgrade Technical Design Report, CERN-LHCC-2013-018, http://cds.cern.ch/record/1602235.
- [3] ATLAS Collaboration, Technical Design Report for the Phase II Upgrade of the ATLAS Trigger and Data Acquisition System, CERN-LHCC-2017-020, http://cds.cern.ch/record/2285584.
- [4] FELIX ATLAS CERN website, https://atlas-project-felix.web.cern.ch/atlas-project-felix/.
- [5] J. Anderson et al., FELIX: a High-Throughput Network Approach for Interfacing to Front End Electronics for ATLAS Upgrade, J. Phys.: Conf. Ser. 664 (2015), 082050.
- [6] Xilinx, UltraScale Architecture GTH Transceivers, http://www.xilinx.com/support/documentation/user_guides/ug576-ultrascale-gth-transceivers.pdf.
- [7] Xilinx, Kintex UltraScale FPGAs Data Sheet: DC and AC Switching Characteristics, DS892.
- [8] SUPERMICRO, SUPERMICRO X10DRG-Q User Manual, MNL1677.