Citation: Xue Xiaoliang, Su Haibing, Shu Huailiang, et al. Research on fault injection system of FPGA in irradiation environment[J]. Opto-Electronic Engineering, 2019, 46(12): 180549. doi: 10.12086/oee.2019.180549
Overview: SRAM-based FPGAs have attracted increasing attention in aerospace applications because of their low cost, rich logic resources, and reconfigurability. However, SRAM cells are highly susceptible to radiation, which manifests as single event upsets (SEUs) and limits the applicability of FPGAs in the aerospace field. Configuration RAM (CRAM) accounts for the largest number of memory cells in an FPGA chip and directly determines the user circuit logic, so CRAM is the research object of this paper. Testing the failure rate of CRAM in a radiation environment requires irradiating the FPGA under an accelerator beam, which simulates the space environment realistically but is expensive and has a long test period. A purpose-built fault injection system that emulates SEUs can instead test the reliability of a design on the FPGA quickly and inexpensively. Faults can be injected into CRAM through an external interface (JTAG or SelectMAP) or an internal interface (the internal configuration access port, ICAP). For internal fault injection, most designs use on-board processors. Starting with Virtex-6/Spartan-6, Xilinx provides a PicoBlaze-based SEM (soft error mitigation) IP core that implements fault injection, fault repair, fault classification, and other functions. Because the PicoBlaze processor has no official C compiler and its instruction space is extremely small (1024 words), the SEM controller cannot be flexibly reprogrammed to implement a different fault repair mechanism. The Virtex-5 series FPGAs studied in this paper have no officially provided SEM IP core, so a self-designed fault injection system is required. This article studies the frame structure of Xilinx FPGA CRAM, giving a method for extracting the frame structure and the order of frames in the bitstream file.
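As a rough illustration of frame-addressed fault injection, the Python sketch below models CRAM as a list of frames and emulates one SEU as a read-modify-write of a single frame. The frame size of 41 32-bit words is the Virtex-5 value from the configuration user guide; everything else is an assumption made for illustration: the helper names (`make_cram`, `locate_bit`, `flip_bit`), the linear bit offset convention, and the choice of bit 0 as the least significant bit. The real mapping from bitstream position to frame address depends on the frame order extracted in the article, and the real system performs the read and write through ICAP rather than on a Python list.

```python
# Toy model of frame-addressed CRAM fault injection (illustrative only).
WORDS_PER_FRAME = 41   # Virtex-5 frame: 41 words of 32 bits
BITS_PER_WORD = 32
BITS_PER_FRAME = WORDS_PER_FRAME * BITS_PER_WORD  # 1312 bits


def make_cram(n_frames):
    """Create a hypothetical all-zero CRAM image with n_frames frames."""
    return [[0] * WORDS_PER_FRAME for _ in range(n_frames)]


def locate_bit(linear_bit):
    """Map a linear bit offset (assumed convention) to (frame, word, bit)."""
    frame, offset = divmod(linear_bit, BITS_PER_FRAME)
    word, bit = divmod(offset, BITS_PER_WORD)
    return frame, word, bit


def flip_bit(cram, frame, word, bit):
    """Emulate one SEU: read a frame, invert one bit, write the frame back."""
    data = list(cram[frame])      # read the whole frame
    data[word] ^= 1 << bit        # invert the target bit
    cram[frame] = data            # write the frame back


cram = make_cram(4)
target = locate_bit(BITS_PER_FRAME + 33)   # second frame, word 1, bit 1
flip_bit(cram, *target)                    # inject the fault
flip_bit(cram, *target)                    # flipping again repairs it
```

Operating on whole frames mirrors the fact that the frame is the smallest unit of configuration memory that can be read or written, so even a single-bit injection costs one frame read and one frame write.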
The structure of the intermediate file of the SEM IP core is also analyzed to obtain the positions of the essential bits. Flipping essential bits between 0 and 1 emulates an SEU. A PC-side interface is designed for human-machine interaction. The fault injection system is implemented on the FPGA chip, and CRAM data are read and written through ICAP without a processor. The fault injection system is placed on resources not used by the circuit under test and occupies about one percent of the FPGA resources, which keeps resource overhead low. The flip-and-repair test classifies essential bits into the following categories: non-critical and repairable, non-critical and unrepairable, critical and repairable, critical and unrepairable, and residual bits that affect other non-masked bits in the same frame. The classification results can be used to protect key bits in subsequent fault repair. In addition, a fault injection test on a triple modular redundancy (TMR) circuit is performed to verify the effectiveness of TMR for SEU protection. For the TMR circuit, the proportion of key bits is greatly reduced but not to zero, which indicates that TMR can reduce the failure rate caused by SEUs but cannot completely avoid such failures. Since TMR cannot eliminate the accumulation of SEU faults, practical engineering applications need to supplement it with other fault-tolerant measures such as internal scrubbing and external scrubbing.
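The point about fault accumulation can be illustrated with a bitwise 2-of-3 majority voter. The Python sketch below is a simplified model, not the article's test circuit: it treats each replica's output as an integer and models upsets as XOR masks, and the names `majority` and `tmr_output` are hypothetical. It shows only the accumulation effect. It does not model upsets in the voter itself or in shared routing, which are among the reasons the proportion of key bits does not fall to zero in a real TMR design.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority vote over three replica outputs."""
    return (a & b) | (a & c) | (b & c)


def tmr_output(golden, upsets):
    """Vote over three replicas; upsets holds one XOR mask per replica,
    modelling the bit flips accumulated in that replica."""
    replicas = [golden ^ mask for mask in upsets]
    return majority(*replicas)


golden = 0b1010

# One upset in one replica is outvoted by the other two.
single = tmr_output(golden, [0b0001, 0, 0])

# Upsets on different bits in two replicas are still masked bit by bit.
spread = tmr_output(golden, [0b0001, 0b0010, 0])

# The same bit upset in two replicas defeats the vote.
accumulated = tmr_output(golden, [0b0001, 0b0001, 0])
```

Here `single` and `spread` equal `golden`, while `accumulated` does not. Scrubbing addresses exactly this case by repairing the first upset before a second one lands on the same bit of another replica.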
The process of parsing frame structure
Distributions of the fault injection system and DUT-related essential bits on the FPGA: (a) plotted in Matlab, (b) in the PlanAhead tool, (c) in the FPGA Editor tool
The architecture diagram of the fault injection system
Block diagram of the FPGA-side fault injection system
Block diagram of the PC-side fault injection system
The interface of PC-side fault injection system