

# A high speed transmitter circuit for the ATLAS/CMS HL-LHC pixel readout chip

T. Wang\*, T. Hemperek, H. Krüger, K. Moustakas, P. Rymaszewski, and M. Vogt

University of Bonn, Nussallee 12, Bonn, Germany. E-mail: t.wang@physik.uni-bonn.de

In order to satisfy the high output bandwidth requirement imposed by the High Luminosity LHC, a high speed transmitter circuit was designed and integrated into the RD53A demonstrator chip for the phase 2 ATLAS/CMS pixel detector upgrade. A clock and data recovery circuit recovers the clock from the 160 Mb/s data stream received by the chip and provides the high speed clock to the serializer, where the 1.28 Gb/s output stream is formed from the 20-bit data words provided by the data encoding logic. The output stage employs a three-tap current-mode logic cable driver with adjustable tap weights for optimal pre-emphasis, in order to compensate for the high frequency loss of the foreseen low mass cable. Each RD53A chip includes four output data lines, offering a total output bandwidth of 5.12 Gb/s. The RD53A chip has been fabricated in a 65 nm CMOS technology. The output jitter was measured to be $\sim 20$ ps ($1\sigma$) with pseudo random data at the nominal speed of 1.28 Gb/s.

Topical Workshop on Electronics for Particle Physics (TWEPP2018) 17-21 September 2018 Antwerp, Belgium

\*Speaker.

<sup>©</sup> Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).

## 1. Introduction

The operation of the High Luminosity Large Hadron Collider (HL-LHC), foreseen around 2026, presents major challenges for hybrid pixel read-out Integrated Circuits (ICs) on several fronts. One of them is a much higher output bandwidth compared to the pixel read-out ICs currently used at the LHC, e.g. FE-I4 [1] and PSI46dig [2]. This is due to a much higher trigger rate of up to 1-4 MHz and the larger number of pixel hits associated with 200 proton-proton interactions per beam crossing at HL-LHC. Therefore, the older generation of output data links that operate at 160 Mb/s must be replaced with a new design that provides multi-Gb/s read-out. Additionally, the unprecedented radiation environment that comes with the high beam luminosity will require the read-out ICs to withstand significantly higher levels of radiation, i.e. 10 MGy in 10 years for the innermost pixel layer.

Within the RD53 collaboration [3], a high speed transmitter was integrated into the RD53A [4] demonstrator chip for the phase 2 ATLAS/CMS pixel detector upgrade at HL-LHC. By combining four output data lines, each operating at 1.28 Gb/s, the RD53A chip reaches a maximum output data rate of 5.12 Gb/s in order to cope with the hit rates in the innermost tracking layer.

The RD53A demonstrator chip has been fabricated in a 65 nm CMOS technology. This paper focuses on the design and characterization of its high speed transmitter. In section 2, the architecture of the data link implemented in the RD53A chip is briefly introduced and the circuit level design of the transmitter is described. Section 3 shows the first measurement results. The work is concluded in section 4.

## 2. Circuit description



Figure 1: Block diagram of the data link for the RD53A chip.

The data link architecture of the RD53A chip is shown in Figure 1. The chip receives the command data stream (CMD in Figure 1), which runs with a custom DC-balanced protocol at 160 Mb/s and from which the clock is recovered by a clock and data recovery (CDR) circuit. The clock recovery data path is shown in Figure 1, where the clock produced by the Phase Locked Loop (PLL) is phase-aligned to the transitions of the input data. The PLL is composed of a phase detector (PD), a charge pump (CP), a loop filter (LF), a voltage controlled oscillator (VCO) based on a ring oscillator, and a frequency divider (FD). The loop bandwidth of the PLL circuit was chosen to be $\sim 15$ MHz, as a trade-off between noise sensitivity and loop stability. The VCO oscillates at 1.28 GHz when the loop is locked. Several clocks running at fractional rates of the 1.28 GHz clock can be obtained at the outputs of the FD. In the nominal operation mode, the 1.28 GHz clock is used as the transmitter clock SER\_CLK, while lower speed clocks from the FD as well as an external clock EXT\_SER\_CLK can also be used. The transmitter includes four parallel data lanes, where a data lane refers to a single serial link. Each data lane has a serializer, which receives 20-bit data words from the data encoding logic and produces a 1.28 Gb/s output data stream that is driven off chip by a cable driver. The chip can be configured to use one, two or four output lanes. The circuit design of the PLL and the cable driver is derived from the data transmitter of the DHP chip developed for the DEPFET pixel detector at Belle-II [5].
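The rates quoted above are related by simple arithmetic, which the following minimal sketch works through. The numeric constants are taken from the text; the variable names are illustrative only, and the ratio of 8 between the VCO frequency and the command bit rate is plain arithmetic, not a statement about how the FD is arranged inside the loop.

```python
from fractions import Fraction

CMD_RATE_HZ = 160_000_000      # command stream bit rate (from the text)
VCO_FREQ_HZ = 1_280_000_000    # VCO frequency when the loop is locked (from the text)
WORD_BITS = 20                 # serializer word width (from the text)

# Ratio between the VCO clock and the command bit rate.
vco_to_cmd_ratio = Fraction(VCO_FREQ_HZ, CMD_RATE_HZ)

# Rate at which one lane consumes 20-bit words when SER_CLK is 1.28 GHz.
word_rate_hz = Fraction(VCO_FREQ_HZ, WORD_BITS)

# Aggregate output bandwidth for the three lane configurations.
aggregate_bps = {lanes: lanes * VCO_FREQ_HZ for lanes in (1, 2, 4)}

# Unit interval (one bit period) in picoseconds.
ui_ps = 1e12 / VCO_FREQ_HZ

print(f"VCO / CMD ratio : {vco_to_cmd_ratio}")
print(f"word rate       : {float(word_rate_hz) / 1e6:.0f} MHz")
print(f"4-lane bandwidth: {aggregate_bps[4] / 1e9:.2f} Gb/s")
print(f"unit interval   : {ui_ps:.2f} ps")
```

At the nominal speed each lane therefore needs a new 20-bit word every 20 unit intervals, and the four-lane configuration gives the 5.12 Gb/s quoted in the introduction.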



Figure 2: Circuit diagram for the serializer with the timing diagram shown on the upper right.

The circuit implementation of the serializer is shown in Figure 2. It employs a double data rate structure and is therefore driven by the half-rate clock SER\_CLK\_DIV2. The 20-bit data word is loaded into the serializer periodically when the LOAD signal is valid. The odd bits of the data word are loaded into the upper chain of flip-flops, whereas the even bits are loaded into the lower flip-flop chain. The data is then shifted towards the output, where a 2:1 MUX controlled by the SER\_CLK\_DIV2 clock alternately selects the even and odd bit streams as the output.
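The double data rate scheme can be illustrated with a small behavioural model. This is a sketch under stated assumptions, not the RD53A logic: bits are numbered from 0, bit 0 is taken to be transmitted first, and the even chain is taken to be selected in the first phase of each half-rate clock period. None of these details are specified in the text, and the function name `ddr_serialize` is hypothetical.

```python
def ddr_serialize(word):
    """Behavioural model of a double-data-rate serializer for one data word.

    `word` is a list of bits; index 0 is assumed to be transmitted first.
    """
    if len(word) % 2 != 0:
        raise ValueError("word length must be even")
    even_chain = list(word[0::2])  # bits 0, 2, 4, ... (lower flip-flop chain)
    odd_chain = list(word[1::2])   # bits 1, 3, 5, ... (upper flip-flop chain)
    out = []
    # Each loop iteration models one period of the half-rate clock:
    # both chains shift once, and the MUX outputs one bit per clock phase.
    while even_chain:
        out.append(even_chain.pop(0))  # first phase: MUX selects the even chain
        out.append(odd_chain.pop(0))   # second phase: MUX selects the odd chain
    return out


# Arbitrary 20-bit test pattern, least significant bit first.
word = [(0xA5C3F >> i) & 1 for i in range(20)]
stream = ddr_serialize(word)
half_rate_cycles = len(word) // 2

print(stream == word, half_rate_cycles)
```

The model makes the benefit explicit: a 20-bit word leaves the serializer in 10 periods of SER\_CLK\_DIV2, so no flip-flop in the shift chains has to toggle at the full bit rate.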



Figure 3: Circuit diagram for the output stage of the transmitter.

The output stage of the transmitter employs a current mode logic (CML) driver, which consists of a common-source differential pair steering the tail current. The driver is back-terminated by 50 $\Omega$ resistors to minimize back reflection. In order to compensate for the high frequency signal loss over the long ($\geq 1$ m) low mass cable envisaged for the real experiments, a three-tap pre-emphasis was implemented, which will work together with the equalization circuit at the receiving end to reverse the low pass effects of the transmission medium. As shown in Figure 3, a tap configuration block generates three bit streams from the input data, separated in time by one unit interval. The pre-driver generates the differential gate voltages for the three-tap CML driver. The weight of each tap is defined by the tail current of the corresponding driver and can be adjusted individually by on-chip 10-bit DACs. The currents of all three taps are summed at the output nodes. The driver for each tap is identical, and the maximum current is $\sim 15$ mA. The activation and polarity of the two later taps (TAP1 and TAP2 in Figure 3) can be fully controlled, offering the flexibility to achieve different pre-emphasis configurations. In addition to the serializer output SER\_OUT, the transmitter data can be selected from the half-rate clock SER\_CLK\_DIV2, the pseudorandom binary sequence (PRBS) generated by a 7-bit Linear Feedback Shift Register (LFSR7), and a constant level "0" for debugging and testing purposes.
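The tap summation described above behaves like a three-tap FIR filter acting on the bit stream, which the following sketch illustrates. Several things here are assumptions rather than facts from the text: the LFSR polynomial $x^7 + x^6 + 1$ (a common PRBS7 choice; the polynomial used on chip is not stated), the tap currents of 10 mA and -3 mA, and the function names `prbs7` and `pre_emphasis`. Currents are treated as signed differential contributions, with the sign representing the tap polarity.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits from a 7-bit LFSR, assuming the polynomial x^7 + x^6 + 1."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of the two tapped stages
        state = ((state << 1) | new_bit) & 0x7F      # shift and feed back
        bits.append(new_bit)
    return bits


def pre_emphasis(bits, tap_ma):
    """Sum the signed tap currents (in mA) for each unit interval.

    tap_ma[k] weights the bit stream delayed by k unit intervals. The first
    len(tap_ma) - 1 intervals are skipped because their history is undefined.
    """
    symbols = [1 if b else -1 for b in bits]  # current steered to one side or the other
    depth = len(tap_ma)
    return [
        sum(w * symbols[n - k] for k, w in enumerate(tap_ma))
        for n in range(depth - 1, len(symbols))
    ]


bits = prbs7(254)                     # two full periods of the sequence
# Hypothetical settings: 10 mA main tap, TAP1 at 3 mA with inverted polarity,
# TAP2 disabled.
two_tap = pre_emphasis(bits, [10.0, -3.0, 0.0])
levels = sorted(set(two_tap))

print(levels)
```

With two active taps the output takes four distinct levels: the larger magnitude occurs on the first bit after a transition and the smaller one when a bit repeats, which is the boost of high frequency content that pre-emphasis is meant to provide.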

## 3. Measurement results

The chip was tested using the BDAQ53 data acquisition system [6]. Figure 4 shows the measurement setup, including a custom PCB carrying the RD53A chip connected to a data acquisition board built upon a commercially available FPGA module. The chip was configured to use one data lane, and the signal at the driver output was measured with a 12.5 GHz differential probe and an oscilloscope with a bandwidth of 8 GHz.

Figure 4: The data acquisition board (left) and the single chip carrier PCB with the RD53A chip (right).

Figure 5 shows the measured eye diagram and time interval error (TIE) with the 1.28 Gb/s PRBS signal from LFSR7. With the 1.28 GHz clock from the CDR circuit, the measured jitter in terms of the standard deviation of the TIE is $\sim 20$ ps. It is dominated by the recovered clock, including the jitter transferred from the input command stream. In order to verify that the driver does not introduce significant jitter, a low jitter (< 2 ps) 1.28 GHz external clock from a pattern generator was used in a second test. The output jitter was then reduced to 5.8 ps, and the random jitter component is equivalent to that of the external input clock. In order to demonstrate the functionality of the pre-emphasis, Figure 6 shows a measured output waveform carrying a random bit stream at 1.28 Gb/s with 2-tap pre-emphasis enabled. The four-level behavior of the waveform is the signature of the pre-emphasis.
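For readers unfamiliar with the jitter metric, the sketch below shows how a TIE standard deviation can be computed from a list of edge times. It is a simplified illustration and not the oscilloscope's algorithm: the ideal edges are assumed to lie on a fixed grid anchored at $t = 0$ with exactly the nominal period, whereas a real instrument fits or recovers the reference clock. The edge times are synthetic, generated with Gaussian jitter of 20 ps to mirror the measured value, and the function name is hypothetical.

```python
import random
import statistics

UI_PS = 781.25  # unit interval at 1.28 Gb/s, in picoseconds


def time_interval_error(edge_times_ps, ui_ps=UI_PS):
    """Return the offset of each edge from the nearest ideal edge position.

    Ideal edges are assumed to sit at integer multiples of ui_ps.
    """
    return [t - round(t / ui_ps) * ui_ps for t in edge_times_ps]


# Synthetic edges: one per unit interval, with Gaussian timing noise.
SIGMA_PS = 20.0
rng = random.Random(1)  # fixed seed so the example is repeatable
edges = [n * UI_PS + rng.gauss(0.0, SIGMA_PS) for n in range(1, 20001)]

tie = time_interval_error(edges)
tie_sigma_ps = statistics.pstdev(tie)

print(f"TIE sigma: {tie_sigma_ps:.1f} ps = {tie_sigma_ps / UI_PS:.1%} of one UI")
```

Expressed this way, a jitter of about 20 ps corresponds to roughly 2.6% of the 781.25 ps unit interval.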

Figure 5: Measured eye diagrams (left) and time interval error (right) with 1.28 Gb/s PRBS signal from LFSR7, using the internal clock from the CDR. The pre-emphasis is disabled in this measurement.

Figure 6: The measured waveform at the driver output carrying a random bit stream at 1.28 Gb/s, using 2-tap pre-emphasis. The CML DAC settings are given in the legend.

## 4. Conclusion

The paper presents the high speed transmitter circuit implemented in the RD53A demonstrator chip for the ATLAS/CMS pixel detector at HL-LHC, which offers a maximum output bandwidth of 5.12 Gb/s by combining four output data lanes. First measurements showed full functionality of the circuit. The output jitter in terms of the standard deviation of the TIE is $\sim 20$ ps with a 1.28 Gb/s PRBS signal and is dominated by the recovered clock. The jitter can be largely reduced if a low jitter external clock is used, with the random component less than 2 ps, demonstrating good performance of the cable driver. Next steps include measuring the circuit with the cable prototypes foreseen in the real experiments once they become available, and characterizing chip samples irradiated by X-rays up to 10 MGy. The PLL circuit has already been seen to be functional up to high radiation levels (5 MGy), but issues of link stability occurred under the combined effects of radiation, low temperature and low supply voltage. A test chip with an improved CDR design has been submitted in order to gain more margin for link stability and to reduce the clock jitter. The new design will be qualified before being included in the final ATLAS/CMS pixel chips.

## References

- [1] M. Garcia-Sciveres et al., *The FE-I4 pixel readout integrated circuit*, *Nucl. Instrum. Meth. A* **636** (2011)
- [2] D. Hits and A. Starodumov, *The CMS Pixel Readout Chip for the Phase 1 Upgrade*, *JINST* **10** (2015) C05029
- [3] J. Christiansen and M. Garcia-Sciveres, *RD Collaboration proposal: Development of pixel readout integrated circuits for extreme rate and radiation*, CERN-LHCC-2013-008, LHCC-P-006 (2013)
- [4] RD53 Collaboration, *The RD53A Integrated Circuit*, CERN-RD53-PUB-17-001 (2017)
- [5] T. Kishishita et al., *Prototype of a gigabit data transmitter in 65nm CMOS for DEPFET pixel detectors at Belle-II*, *Nucl. Instrum. Meth. A* **718** (2013)
- [6] M. Vogt et al., *Characterization and Verification Environment for the RD53A Pixel Readout Chip in 65 nm CMOS*, PoS(TWEPP-17) 084 (2017)