Characterization of Soft Error Rate Against Memory Elements Spacing and Clock Skew in a Logic with Triple Modular Redundancy in a 65nm Process

Sandeep Miryala
Fermi National Accelerator Laboratory
Pine Street, Batavia, IL-60510, U.S.A
E-mail: smiryala@fnal.gov

Tomasz Hemperek
University of Bonn
Nussallee 12, Bonn, Germany
E-mail: hemperek@physik.uni-bonn.de

Mohsine Menouni
Aix Marseille, CNRS/IN2P3, CPPM
163, avenue de Luminy 13288
Marseille, France
E-mail: menouni@cppm.in2p3.fr

Single Event Effects introduce soft errors in ASICs. Triple Modular Redundancy (TMR) with clock skew insertion, a system-level redundancy technique, is commonly used by designers to mitigate soft errors. However, the optimal spacing between memory elements in a TMR cell in a 65 nm process has not been addressed so far. RD53SEU is a mini-ASIC developed within the framework of the CERN RD53 collaboration to characterize soft error rates against the spacing and the clock skew between memory elements in a TMR cell. This article describes the architecture and design aspects of the various test structures on the RD53SEU test chip.

1. Introduction

Single Event Effects (SEEs) are very common in ASICs developed for detector electronics, as these are exposed to energetic ionizing particles from the particle collisions. SEEs in the form of Single Event Upsets (SEUs) and Single Event Transients (SETs) manifest themselves as bit flips in sequential elements and glitches in combinational gates, respectively.

In the joint effort between the ATLAS and CMS groups on the RD53B pixel chip, a readout chip for the inner tracker silicon pixel detector, the estimated rate of bit flips due to SEUs in the global configuration registers is one bit flip per ~20 seconds per chip, whereas in the pixel registers it is ~60 bit flips per second per pixel per chip [1] [2]. Hence SEE-tolerant design is unavoidable for RD53B in the pixel configuration registers, the global configuration registers and the state machines in the digital chip bottom. Triple Modular Redundancy (TMR) with clock skew is a system-level redundancy technique to counter single event effects.

Before resorting to TMR, some common design questions must be addressed. One of them is the choice of spacing between the memory elements in a TMR cell. Characterizing the soft error rate as a function of memory element spacing helps designers choose the optimal spacing for their applications. The spacing not only determines the final area of the design but also affects how susceptible the TMR cell is to multiple bit upsets. The same question arises for latch-based TMR, and studying its soft error rate is equally important because latches are often used in the pixel configuration registers of pixelated integrated circuits. SETs, on the other hand, introduce unintended glitches, and the range of glitch widths is not clearly understood. Introducing clock skew between the memory elements in a TMR logic is known to mitigate single event transients. However, clock skewing reduces the timing margin of a design, and the optimal clock skew to mitigate SETs has not been addressed in the available literature. These issues can be studied with the test structures implemented on the RD53SEU test chip. This article summarizes the test structures designed to address these concerns and their implementation.

2. TMR Versions

Several versions of Triple Modular Redundancy (TMR) exist in the literature; the ones considered on this test chip are shown in Figure 1.

2.1 Triple Modular Redundancy without correction

Figure 1.a shows a simple triplication scheme with a majority voter. This version is often used in the state machines of readout integrated circuits or in the counters of pixels with limited area.
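The voting step can be illustrated with a short behavioural sketch. It is not the gate-level voter on the chip; the function name `majority` and the example values are chosen here for illustration only.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three redundant copies."""
    return (a & b) | (b & c) | (a & c)


# Three copies of one stored bit; the second copy has been flipped by an upset.
stored = 1
copies = [stored, stored ^ 1, stored]
voted = majority(*copies)
print(voted)  # the single upset is masked by the voter
```

Since nothing writes the voted value back in this version, a second upset in another copy before the next regular write would no longer be masked.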

2.2 Triple Modular Redundancy with correction

In Figure 1.b, the triplicated memory elements and the majority voter are complemented by a correction mechanism. The voted value is fed back to the registers through a multiplexer. This TMR version is often used in global configuration registers. An upset in any one of the registers is corrected in the next clock cycle.
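A minimal behavioural model of this feedback loop is sketched below, assuming one multiplexer that selects either new data or the voted value for all three copies. The class `CorrectingTmrBit` and its method names are hypothetical and do not come from the chip design.

```python
from dataclasses import dataclass, field
from typing import List


def vote(bits: List[int]) -> int:
    """2-of-3 majority of three single-bit copies."""
    a, b, c = bits
    return (a & b) | (b & c) | (a & c)


@dataclass
class CorrectingTmrBit:
    """Hypothetical model of one TMR bit with voted feedback."""
    copies: List[int] = field(default_factory=lambda: [0, 0, 0])

    def clock(self, load: bool = False, data: int = 0) -> None:
        # Multiplexer: take new data when loading, otherwise the voted value.
        nxt = data if load else vote(self.copies)
        self.copies = [nxt, nxt, nxt]

    def upset(self, index: int) -> None:
        """Emulate an SEU by flipping one copy."""
        self.copies[index] ^= 1

    @property
    def q(self) -> int:
        return vote(self.copies)


cell = CorrectingTmrBit()
cell.clock(load=True, data=1)
cell.upset(0)
after_upset = list(cell.copies)       # one copy disagrees
cell.clock()
after_correction = list(cell.copies)  # all copies agree again
```

In this model, upsets that arrive in different clock cycles do not accumulate, because each one is overwritten at the next clock edge.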

2.3 Triple Modular Redundancy with clock skew insertion

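In Figure 1.c, additional delays (delay 1 and delay 2) are inserted in the clock lines of the memory elements. Because the flip-flops then sample the data at different instants, a single event transient arising in the combinational data path that is shorter than the skew can be captured by at most one flip-flop and is therefore filtered out.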
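The filtering condition can be checked with a small timing sketch. It assumes ideal flip-flops (no setup or hold window) and uses hypothetical sampling instants of 0, 400 and 800 ps, which are not the delay values implemented on the chip.

```python
from typing import Sequence

# Hypothetical sampling instants of the three flip-flops, in ps, relative
# to the undelayed clock edge (not values from the chip).
SAMPLE_TIMES_PS = (0.0, 400.0, 800.0)


def corrupted_copies(pulse_start_ps: float, pulse_width_ps: float,
                     sample_times_ps: Sequence[float] = SAMPLE_TIMES_PS) -> int:
    """Count the flip-flops whose sampling instant falls inside the pulse."""
    pulse_end_ps = pulse_start_ps + pulse_width_ps
    return sum(pulse_start_ps <= t < pulse_end_ps for t in sample_times_ps)


def voter_output_wrong(pulse_start_ps: float, pulse_width_ps: float,
                       sample_times_ps: Sequence[float] = SAMPLE_TIMES_PS) -> bool:
    """The majority voter fails only if two or more copies latched the glitch."""
    return corrupted_copies(pulse_start_ps, pulse_width_ps, sample_times_ps) >= 2


# A 300 ps transient is narrower than the 400 ps skew; a 500 ps one is not.
narrow = voter_output_wrong(pulse_start_ps=-100.0, pulse_width_ps=300.0)
wide = voter_output_wrong(pulse_start_ps=-50.0, pulse_width_ps=500.0)
print(narrow, wide)
```

The sketch also shows the trade-off mentioned in the introduction: tolerating wider transients requires a larger skew, which is taken from the timing margin of the design.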

In general, depending on the available power budget, timing slack and area, the designer can choose any of these TMR versions.

Figure 1: TMR versions: a. TMR without correction, b. TMR with correction, c. TMR with clock skew insertion

3. Shift register based implementation

The implementation is based on a shift register approach (Figure 2), where a known test pattern is loaded serially through Shift_In. After ion beam exposure for a certain duration, the contents of the irradiated registers are read out serially through the Shift_Out pin. Loading into and reading from the shift register is controlled by the Shift_En signal. The size of each shift register is 1 kb, and each cell is implemented in one of the versions discussed in Section 2. In total, there are 18 shift register based test structures, which differ in TMR version, memory element type and spacing.

![Shift Register Implementation](image)

Figure 2: Shift register implementation and associated signals
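The load, expose and read-out sequence can be sketched behaviourally as follows. The sketch assumes that "1 kb" means 1024 cells, models each cell as a single unprotected bit, and emulates the beam exposure by flipping two arbitrarily chosen cells. The class `ShiftRegister` is hypothetical.

```python
from collections import deque

N_BITS = 1024  # assumption: "1 kb" taken as 1024 cells


class ShiftRegister:
    """Hypothetical serial-in, serial-out register with an enable."""

    def __init__(self, n_bits: int = N_BITS) -> None:
        self.cells = deque([0] * n_bits, maxlen=n_bits)

    def shift(self, shift_in: int, shift_en: bool = True) -> int:
        """One clock: return Shift_Out and, if enabled, take in Shift_In."""
        shift_out = self.cells[-1]
        if shift_en:
            # With maxlen set, appendleft drops the cell at the output end.
            self.cells.appendleft(shift_in)
        return shift_out


pattern = [i % 2 for i in range(N_BITS)]  # known test pattern
sr = ShiftRegister()
for bit in pattern:                       # serial load through Shift_In
    sr.shift(bit)
for index in (3, 500):                    # emulated upsets during exposure
    sr.cells[index] ^= 1
read_back = [sr.shift(0) for _ in range(N_BITS)]  # serial read-out
flips = sum(a != b for a, b in zip(pattern, read_back))
print(flips)
```

With TMR cells in place of the plain bits, a flip would only show up in the read-back pattern if the voter of that cell was defeated, which is the quantity the test structures are meant to compare across spacings.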

3.1 Spacing between TMR Memory Elements, Memory Element Types and Delay Values

Triplication is effective only if at most one of the memory elements in a TMR cell is affected by a single event upset. This can only be ensured by sufficient spacing between the memory elements in the layout. A larger separation is preferable to counter multi-bit upsets from a single particle strike. However, the separation distance is limited by the available area and the timing constraints of the design. We have considered several spacing options in order to quantify the soft error rate against memory element spacing in a TMR cell. Test structures with TMR element spacings of 5 µm, 10 µm and 15 µm have been designed.
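A spacing requirement of this kind can be checked on placed cell coordinates with a few lines of code. The sketch below assumes that spacing is measured as the centre-to-centre Euclidean distance, which the text does not specify, and the coordinates are made up for illustration.

```python
from itertools import combinations
from math import hypot
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) cell position in micrometres


def min_pairwise_spacing_um(positions: Sequence[Point]) -> float:
    """Smallest distance between any two of the triplicated cells."""
    return min(hypot(x1 - x2, y1 - y2)
               for (x1, y1), (x2, y2) in combinations(positions, 2))


def meets_spacing(positions: Sequence[Point], required_um: float) -> bool:
    """True if every pair of cells is at least required_um apart."""
    return min_pairwise_spacing_um(positions) >= required_um


# Hypothetical placement of the three copies of one TMR bit.
tmr_cells = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
ok_for_10um = meets_spacing(tmr_cells, 10.0)
ok_for_15um = meets_spacing(tmr_cells, 15.0)
```

Checking the minimum over all three pairs matters because a strike that upsets any two copies is enough to defeat the voter.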

The memory elements most often used in readout integrated circuits are flip-flops and latches. We have included two flavours of flip-flops: a standard D flip-flop and a D flip-flop with asynchronous reset. We have also considered two flavours of latches: a standard latch and a custom latch.

We have designed test structures with three different combinations of delay insertion in a TMR, based on Figure 1.c.

4. Single Event Transients (SETs) Pulse Width Measurement

We have designed two test structures in order to characterize the cross section (rate) and the widths of random SET pulses [3]. Both of them are based on a design containing two main blocks:

a. SET Target Combinational Logic Block
b. SET Analyzer Block

The designs are based on the assumption that the range of SET pulse width is typically from 50 ps to 800 ps.

4.1 Design Based on the trigger capture

This test structure captures the SET pulse generated in the target block. The capture block is composed of a series of inverter-latch combinations. Each unit delay is ~40 ps and the trigger is generated after the 40th stage, as can be seen in Figure 3.a. This structure is better suited to SET characterization with a laser beam. Its main drawback is that it can itself be sensitive to SEUs because of the large number of latches and a flip-flop.
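One plausible way to turn the latched snapshot into a pulse width is sketched below. It assumes that the latch outputs have already been normalised so that 1 marks a stage that held the pulse when the trigger fired, and that the width is the number of consecutive such stages times the unit delay. The read-out interpretation actually used on the chip may differ.

```python
from itertools import groupby
from typing import Sequence

UNIT_DELAY_PS = 40  # per-stage delay quoted in the text (approximate)
N_STAGES = 40       # the trigger is generated after the 40th stage


def estimate_width_ps(snapshot: Sequence[int]) -> int:
    """Estimate the SET width from the longest run of stages holding the pulse."""
    if len(snapshot) != N_STAGES:
        raise ValueError(f"expected {N_STAGES} latch values")
    runs = [len(list(group)) for value, group in groupby(snapshot) if value == 1]
    return max(runs, default=0) * UNIT_DELAY_PS


# Hypothetical snapshot: the pulse occupies five consecutive stages.
snapshot = [0] * 10 + [1] * 5 + [0] * 25
width_ps = estimate_width_ps(snapshot)
print(width_ps)
```

Under these assumptions the resolution is one unit delay, and the 40 stages span about 1600 ps, which covers the 50 ps to 800 ps range assumed for the designs.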

4.2 Design Based on the temporal filtering

This test structure consists of a target circuit and an analyzer circuit, as shown in Figure 3.b. The analyzer circuit is composed of eight stages of pulse-filter flip-flops. The pulse filters are based on delay elements with different delay values.
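The principle of bracketing a pulse width with filters of increasing delay can be sketched as follows. The eight delay values are hypothetical, since the text does not list them, and each stage is idealised as passing exactly those pulses that are wider than its filter delay.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical filter delays in ps, one per analyzer stage.
FILTER_DELAYS_PS = (100, 200, 300, 400, 500, 600, 700, 800)


def stage_flags(pulse_width_ps: float) -> List[bool]:
    """Idealised response: a stage is set if the pulse outlasts its filter."""
    return [pulse_width_ps > delay for delay in FILTER_DELAYS_PS]


def width_bracket(flags: Sequence[bool]) -> Tuple[int, Optional[int]]:
    """Return (lower, upper) bounds on the width; upper is None if unbounded."""
    passed = [d for d, flag in zip(FILTER_DELAYS_PS, flags) if flag]
    blocked = [d for d, flag in zip(FILTER_DELAYS_PS, flags) if not flag]
    lower = max(passed) if passed else 0
    upper = min(blocked) if blocked else None
    return lower, upper


bracket = width_bracket(stage_flags(350.0))
print(bracket)
```

Compared with the trigger capture design, this approach gives only a coarse histogram of widths, with a bin size set by the step between neighbouring filter delays.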

![Figure 3: SET pulse measurement test structures](image)

5. Control Signals and Implementation

The control signals associated with the shift register based test structures are shown in Figure 4. CLOCK, SHIFTEN, LOAD, SHIFTIN and READBACK are the primary inputs for the 18 different shift register based test structures. The inputs and outputs of these test structures are demultiplexed and multiplexed to meet the IO limitations of the chip. The test pattern is loaded serially into the test structures, which are then exposed to the ion beam for a certain duration. The irradiated test structure is then read out serially. A counter in the DAQ system analyzes the read pattern to count the number of bit flips arising from single event effects. LOAD and READBACK are input signals used specifically in the latch-based test structures. The SET measurement block has dedicated primary inputs and outputs.
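The counting step and its conversion into a per-bit cross section can be sketched as below. The cross section is taken as the number of errors divided by the particle fluence and the number of bits, which is the usual definition. The fluence, the test pattern and the function names are assumptions made for this example and are not taken from the DAQ system.

```python
import math


def count_bit_flips(written: bytes, read: bytes) -> int:
    """Number of differing bits between the loaded and the read-back pattern."""
    if len(written) != len(read):
        raise ValueError("patterns must have the same length")
    return sum(bin(w ^ r).count("1") for w, r in zip(written, read))


def cross_section_per_bit_cm2(n_flips: int, fluence_per_cm2: float,
                              n_bits: int) -> float:
    """Per-bit cross section: errors / (fluence * number of bits)."""
    return n_flips / (fluence_per_cm2 * n_bits)


written = bytes([0xAA] * 128)   # 1024-bit pattern (hypothetical)
read = bytearray(written)
read[5] ^= 0x01                 # two emulated bit flips
read[77] ^= 0x80
n_flips = count_bit_flips(written, bytes(read))

FLUENCE_PER_CM2 = 1e7           # hypothetical fluence for this run
sigma = cross_section_per_bit_cm2(n_flips, FLUENCE_PER_CM2, n_bits=1024)
print(n_flips, sigma)
```

Repeating this calculation for each of the 18 test structures would give the soft error rate against spacing, memory element type and TMR version.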

5.1 TMR based digital design flow

The triplication, the constraints on the spacing of the memory elements and the clock skew insertion are all implemented with a conventional digital design flow, using the commercial tools available in the high energy physics community. As can be seen in Figure 4, triplication is carried out during synthesis itself, whereas the spatial separation between the TMR memory elements is applied as a constraint during the placement stage, and clock skew or delay insertion is carried out during the clock tree synthesis phase of the physical design.

6. Conclusion

The test structures designed to characterize the soft error rate against memory element spacing and clock skew in a logic with TMR on the RD53SEU test chip have been described in this article. The RD53SEU test chip is a 2 mm x 2 mm ASIC realized in a 65 nm process. The TMR implementation in the digital logic is based on a conventional digital design flow. The board design for testing the chip is underway, and various radiation facilities are being considered for carrying out all the needed tests.

![Figure 4: Control signals and TMR based digital flow](image)

7. Acknowledgements

This work has been done in the framework of the CERN RD53 collaboration. The authors would like to thank Jorgen Christiansen (jorgen.christiansen@cern.ch), Maurice G Scrivere (maurice@lbl.gov), James Hoff (jimhoff@fnal.gov) and Grzegorz Deptuch (deptuch@fnal.gov) for their constructive comments during the work on this chip.

This manuscript has been authored by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy, Office of Science, Office of High Energy Physics. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for U.S. Government purposes.

References