Clock and Trigger Distribution for ALICE Using the CRU FPGA Card

Jozsef Imrek*
Hungarian Academy of Sciences, Wigner Research Center for Physics (HU)
E-mail: jozsef.imrek@cern.ch
for the ALICE Collaboration

ALICE is preparing a major upgrade for 2021. Subdetectors upgrading their DAQ electronics will use common hardware to receive physics data: the Common Readout Unit (CRU). The same CRU will also distribute the LHC clock and trigger to many of the upgrading subdetectors (to 7800 frontend cards).

Requirements are strict: for the clock the allowed jitter (RMS) is typically <300ps, and <20ps for timing critical subdetectors; the allowed variation of skew is typically <1ns, and <100ps for timing critical subdetectors. A constant latency for distributing the trigger is essential.

A novel approach to implement clock forwarding – using only the internal PLLs of the CRU’s onboard FPGA, without using an external jitter cleaner PLL – is presented.

Topical Workshop on Electronics for Particle Physics
11 - 14 September 2017
Santa Cruz, California

*Speaker.

© Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).

1. ALICE Run3 DAQ Upgrade

ALICE is an experiment at the CERN LHC, dedicated in particular to the study of the properties of Quark Gluon Plasma. ALICE is preparing for a major upgrade, and in Run3 (starting from 2021) it will collect data with several upgraded sub-detector systems. The Data Acquisition (DAQ) System will also be upgraded [1]: detectors will be read out using the Common Readout Unit (CRU) [2], a PCIe readout card based on an Altera Arria 10 GX 1150 FPGA that supports up to 48 optical links and a gen3 ×16 PCIe interface for host PC connectivity. The CRU will also distribute the LHC reference clock and ALICE trigger information to most of the upgrading frontend cards.

2. CRU and Clock Distribution using PON and GBT

Large physics experiments often comprise a high number of readout channels and frontend cards, where the operation of the frontend cards is synchronized using a common reference clock (usually derived from the experiment’s beam clock). This reference clock is typically distributed to the frontend cards through a hierarchical, multi-level tree structure, often using optical interconnects.

Passive Optical Network (PON) is a commercial point-to-multipoint architecture that fits this tree-like structure very well. It was adopted by CERN for the Run3 upgrade to distribute Timing, Trigger, and Control (TTC) information, see [5] and [6].

This TTC-PON technology was selected by ALICE for Run3 to distribute the reference clock and trigger between the Central Trigger Processor (CTP) and the CRU cards; in turn, the GigaBit Transceiver (GBT) [3] protocol is used to forward the clock and trigger from the CRU to the frontend cards. See [7] for details about the multi-level PON architecture used by ALICE.

3. Clock Forwarding with External Jitter Cleaner PLL

At the intermediate-level nodes of the clock distribution tree it is sometimes necessary to improve the quality (typically the jitter) of the incoming reference clock before it is forwarded to the next level. This is usually achieved using a dedicated external component, a Phase Locked Loop (PLL). The motivation is typically twofold: physics requirements and system requirements.

Physics requirements come from the physics goals of the detectors: a given physics measurement might require a certain timing resolution, which in turn puts constraints on its reference clock. For example, in ALICE the TOF frontend needs a reference clock with a jitter of less than 20 ps (RMS).

System requirements come from the overall system architecture, for example the need for stable clock recovery and error-free data transmission. This can affect component selection: in our case the initial analysis of the recovered TTC-PON clock in the CRU card showed that the external jitter cleaner PLL originally selected for the CRU card (Si5338) does not produce a clean enough clock; it was therefore replaced with one (Si534x) providing better jitter attenuation (see Fig. 1).

4. Operation without External Jitter Cleaner

Using an external jitter cleaner PLL comes at a cost. It adds (even if only slightly) to the complexity of the system and the readout card, and takes up PCB real estate.

Figure 1: Phase noise for TTC-PON recovered clock in Altera A10 GX devkit (red), after jitter cleaners (blue and green), and requirements of the FPGA transceiver (black, specified by the manufacturer) – lower is better. The Si5338 PLL does not provide enough jitter attenuation, while Si5344 meets requirements.

It also introduces additional buffers in the clock path (when leaving and entering the FPGA and the external PLL), which add delays prone to Process, Voltage and Temperature (PVT) variations. The uncompensated delays erode the timing budget when crossing between clock domains on the two sides of the buffers, up to the point where Static Timing Analysis (STA) is no longer sufficient to implement a fixed latency data path (essential for trigger distribution), and servo mechanisms must be used to automatically adjust buffer delays on the fly, adding system complexity.

Finally, an external jitter cleaner in the middle of the clock distribution chain might do little to improve the quality of the final reference clock. If the amount of jitter seen at the leaf of the clock distribution tree is determined mostly by the quality (jitter transfer and jitter generation) of the last PLL in the chain, the intermediate jitter cleaner might not be essential to meet system requirements.

Figure 2: Clock forwarding in the ALICE CRU card using an external jitter cleaner PLL (yellow path), or using only PLLs internal to the FPGA (light blue path). The simplified block diagram shows the major FPGA firmware building blocks in play (TTC-PON and GBT receiver and transmitter, detector specific data processing User Logic), and the interconnecting clock and data signals (color-coded by clock domains).

Therefore we propose a clock forwarding implementation in the FPGA that uses no external PLLs, only the PLLs available inside the FPGA (see Fig. 2).

5. Measurement Results

To test the clock quality experimentally without an external PLL, we created a CRU FPGA firmware that used only FPGA-internal PLLs to process the forwarded reference clock. Another firmware, which used an external Si5345 jitter cleaner (mounted on the PLL manufacturer’s evaluation board), served as a baseline for the measurement.

We measured the total jitter at different points along the clock distribution chain (see Fig. 2) by taking the standard deviation (measured by an oscilloscope) of the time interval between the rising edges of clocks: at the output of the external jitter cleaner (when present), at the transceiver output of the FPGA (where bunches of 1s and 0s were transmitted to create a 40 MHz clock), and at an ELink clock output of a Versatile Link Demo Board (VLDB) [4] connected to the CRU card. The measurement results are summarized in Table 1: the transceiver output (column XCVR) has higher jitter, but the jitter on the ELink clock output (the figure-of-merit for the detectors, as this is their reference clock) shows no significant penalty for the internal PLL based firmware.
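The jitter figure described above, a standard deviation over a histogram of edge-to-edge time intervals, can be sketched as follows. This is a minimal illustration and not the oscilloscope's actual algorithm: the function name `rms_jitter_ps`, the Gaussian jitter model and the 8.8 ps sigma used to generate synthetic samples are all assumptions made for the example.

```python
import random
import statistics


def rms_jitter_ps(intervals_ps):
    """RMS jitter as the sample standard deviation of edge-to-edge intervals (ps)."""
    return statistics.stdev(intervals_ps)


# Synthetic stand-in for the oscilloscope histogram: a 40 MHz clock
# (25 000 ps nominal period) with Gaussian jitter. The Gaussian model and
# the sigma value are illustrative assumptions, not measured data.
rng = random.Random(2017)
NOMINAL_PERIOD_PS = 25_000.0
ASSUMED_SIGMA_PS = 8.8
samples_ps = [rng.gauss(NOMINAL_PERIOD_PS, ASSUMED_SIGMA_PS) for _ in range(100_000)]

measured_ps = rms_jitter_ps(samples_ps)
print(f"RMS jitter over {len(samples_ps)} intervals: {measured_ps:.3f} ps")
```

With a few hundred thousand histogram entries, as in the measurements reported here, the statistical uncertainty of such an estimate is small compared to the estimate itself.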

<table>
<thead>
<tr>
<th>PLL</th>
<th>PLL bw [Hz]</th>
<th>histo entries [k]</th>
<th>cleaned clk [ps]</th>
<th>XCVR [ps]</th>
<th>ELink clk [ps]</th>
</tr>
</thead>
<tbody>
<tr>
<td>external</td>
<td>1000</td>
<td>255.99</td>
<td>3.765</td>
<td>1.407</td>
<td>8.871</td>
</tr>
<tr>
<td>external</td>
<td>100</td>
<td>383.99</td>
<td>3.772</td>
<td>1.379</td>
<td>8.808</td>
</tr>
<tr>
<td>external</td>
<td>10</td>
<td>383.99</td>
<td>3.766</td>
<td>1.418</td>
<td>8.959</td>
</tr>
<tr>
<td>external</td>
<td>1</td>
<td>319.99</td>
<td>3.751</td>
<td>1.413</td>
<td>8.851</td>
</tr>
<tr>
<td>external</td>
<td>0.1</td>
<td>447.99</td>
<td>3.774</td>
<td>1.413</td>
<td>8.925</td>
</tr>
<tr>
<td>internal</td>
<td>“low”</td>
<td></td>
<td>-</td>
<td>1.694</td>
<td>8.830</td>
</tr>
</tbody>
</table>

Table 1: Measured jitter with external PLL (at different loop bandwidth settings) and without external PLL. The final figure-of-merit is the ELink clock jitter (last column).
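As a quick cross-check of the comparison drawn from Table 1, the short sketch below copies the jitter values from the table and compares the internal-PLL row against the five external-PLL rows. The variable names are ours, and the in-range test is only one simple way to express "no significant penalty"; it is not a formal significance test.

```python
import statistics

# ELink clock jitter values copied from Table 1 (ps).
ELINK_EXTERNAL_PS = [8.871, 8.808, 8.959, 8.851, 8.925]
ELINK_INTERNAL_PS = 8.830

# Transceiver (XCVR) output jitter values copied from Table 1 (ps).
XCVR_EXTERNAL_PS = [1.407, 1.379, 1.418, 1.413, 1.413]
XCVR_INTERNAL_PS = 1.694

elink_mean = statistics.mean(ELINK_EXTERNAL_PS)
# Does the internal-PLL ELink jitter fall inside the spread of the external runs?
elink_in_range = min(ELINK_EXTERNAL_PS) <= ELINK_INTERNAL_PS <= max(ELINK_EXTERNAL_PS)
# How much higher is the internal-PLL jitter at the transceiver output?
xcvr_ratio = XCVR_INTERNAL_PS / statistics.mean(XCVR_EXTERNAL_PS)

print(f"ELink, external PLL mean: {elink_mean:.3f} ps")
print(f"ELink, internal PLL within external range: {elink_in_range}")
print(f"XCVR, internal / external mean: {xcvr_ratio:.2f}")
```

The internal-PLL ELink value lies inside the range spanned by the external-PLL settings, while the transceiver output jitter is roughly 20% higher.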

We also measured the skew between the 24 ELink outputs of the internal PLL based FPGA firmware. Transceiver outputs were connected to two VLDBs, one serving as a fixed reference (GBT link #0), the other cycling through all other links (GBT link #1-#23). The same one-shot pulse was sent as data on all GBT links simultaneously, and the delay between rising edges on the VLDBs’ ELink0 data outputs was measured statistically using an oscilloscope. The results are shown in Fig. 3: the lane-to-lane skew meets the specification of max 500 ps.
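The skew check can be expressed in a few lines. In the sketch below, lane-to-lane skew is taken as the spread between the earliest and latest lane, with the reference link at zero delay; this definition, the function name and the example delay values are assumptions for illustration only, and the 500 ps limit is the specification quoted above.

```python
SPEC_MAX_SKEW_PS = 500.0  # lane-to-lane skew limit quoted in the text


def lane_to_lane_skew_ps(delays_ps):
    """Spread between the earliest and the latest lane, in ps."""
    return max(delays_ps) - min(delays_ps)


# Hypothetical mean delays relative to GBT link #0 (first entry, 0 ps by
# definition). These numbers are made up for the example, not measured.
example_delays_ps = [0.0, 120.0, -85.0, 210.0, -40.0]

skew_ps = lane_to_lane_skew_ps(example_delays_ps)
meets_spec = skew_ps <= SPEC_MAX_SKEW_PS
print(f"lane-to-lane skew: {skew_ps:.1f} ps, within spec: {meets_spec}")
```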

Even though we did not meet the FPGA vendor’s requirements (see Fig. 1) on the reference clock of the transceivers, the GBT links operated without bit errors. It is important to note that the requirements are specified to cover even the highest bit rates the transceivers support (10 Gbit/s). In ALICE, however, they operate at the GBT line rate (only 4.8 Gbit/s), where the Unit Interval (UI) is much larger; therefore the actual jitter requirements on the reference clock are much more relaxed.
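The difference in Unit Interval between the two line rates follows directly from UI = 1 / (line rate). The sketch below works it out for the two rates mentioned above; the helper name `unit_interval_ps` is ours, and how the vendor's phase noise mask scales with the UI is not modeled here.

```python
def unit_interval_ps(line_rate_gbps):
    """Unit Interval in picoseconds for a serial line rate given in Gbit/s."""
    return 1000.0 / line_rate_gbps


MAX_RATE_GBPS = 10.0  # highest rate covered by the vendor requirement (from the text)
GBT_RATE_GBPS = 4.8   # GBT line rate used in ALICE (from the text)

ui_max_rate_ps = unit_interval_ps(MAX_RATE_GBPS)
ui_gbt_ps = unit_interval_ps(GBT_RATE_GBPS)

print(f"UI at {MAX_RATE_GBPS} Gbit/s: {ui_max_rate_ps:.1f} ps")
print(f"UI at {GBT_RATE_GBPS} Gbit/s: {ui_gbt_ps:.1f} ps")
print(f"ratio: {ui_gbt_ps / ui_max_rate_ps:.2f}")
```

At the GBT line rate the UI is a little more than twice as long as at 10 Gbit/s, so a given amount of reference clock jitter consumes a correspondingly smaller fraction of the bit period.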

6. Summary and Future Work

We proposed that using an external jitter cleaner PLL to improve clock quality in clock forwarding applications is, while typical, not always essential. Using jitter, skew and Bit Error Rate (BER) measurements we showed for the case of the ALICE CRU card that an implementation using only PLLs inside the card’s FPGA can meet physics and system requirements.

Future phase noise measurements will allow us to better characterize how the total jitter and the jitter spectrum are affected by each system component, while more BER measurements with a variable optical attenuator will give us better estimates of the margins we have in the optical budget.

**Acknowledgments**

We would like to thank Eduardo B. De Souza Mendes from the CERN EP-ESE-BE Group, and Marian Krivda and Roman Lietava from the ALICE Trigger Group for their input to this work. This work was supported by the Hungarian OTKA research grant K120660.

**References**


