
Highly Reliable, High-Performance Deep Learning Accelerator for ADAS and Autonomous Driving Systems

Next-generation ADAS and autonomous driving (AD) systems, when deployed to market, will require accurate and high-speed recognition, judgment, and operation.

Katsushige Matsubara, Sr. Manager, Automotive Solution Business Unit, Renesas Electronics Corporation

Renesas presented these achievements at the International Solid-State Circuits Conference 2021 (ISSCC 2021), which took place February 13 to 22, 2021. We will continue to develop and deploy in-vehicle LSIs based on this technology, and we expect them to contribute to the realization of a safe and secure car society through the spread of ADAS and AD systems.

Convolutional neural networks (CNNs) require large amounts of computation for pattern recognition, and the required CNN performance rises as the number of installed sensors increases. However, because power consumption increases in proportion to performance, a high-performance device would need a heavy and expensive water-cooling system. The goal is therefore to combine high deep learning performance with power consumption low enough for a lightweight, cost-effective air-cooling system. From a practical point of view, the optimal target is a CNN performance of 60TOPS at an efficiency of 10TOPS/W in a single LSI device.

CNN accelerator with high performance and power efficiency

The performance/efficiency target for the CNN accelerator (CNNA) is 60TOPS at 10TOPS/W. In the implementation, this is realized with three identical accelerators instead of a single one. Each CNNA contains 13,824 MAC arithmetic units and operates at 800MHz.

The theoretical maximum performance of the three CNNAs is 66TOPS. In addition, each CNNA is connected to a 2MB dedicated scratchpad memory (SPM) through a 512-bit interconnect module. This increases the execution efficiency of the CNNA, reduces the amount of data transferred between the CNNA and external memory (DRAM) by about 90%, and saves the power consumed by the DRAM interface and interconnect. In measurements on a test chip, VGG16 achieved 32TOPS at 6.1TOPS/W, and a CNNA-optimized network (Network-A) achieved 60.6TOPS at 13.8TOPS/W.
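The 66TOPS figure can be reproduced from the numbers above with a short calculation. The sketch below assumes the common convention that one MAC counts as two operations (a multiply and an add), which the article does not state explicitly. The function names are illustrative, and the "implied power" is only what the quoted TOPS/W figures would suggest if efficiency is simply throughput divided by power.

```python
# Peak-throughput arithmetic for the figures quoted in the article.
# Assumption (not stated in the article): 1 MAC = 2 operations.
MACS_PER_CNNA = 13_824   # MAC units per accelerator (from the article)
CLOCK_HZ = 800e6         # 800MHz (from the article)
NUM_CNNA = 3             # three identical accelerators (from the article)
OPS_PER_MAC = 2          # assumed counting convention


def peak_tops(macs, clock_hz, units, ops_per_mac=OPS_PER_MAC):
    """Theoretical peak throughput in TOPS."""
    return macs * ops_per_mac * clock_hz * units / 1e12


def implied_power_w(tops, tops_per_watt):
    """Power implied by a throughput and an efficiency figure."""
    return tops / tops_per_watt


peak = peak_tops(MACS_PER_CNNA, CLOCK_HZ, NUM_CNNA)
print(f"theoretical peak: {peak:.1f} TOPS")

# Measured test-chip results quoted in the article: (TOPS, TOPS/W).
measured = {"VGG16": (32.0, 6.1), "Network-A": (60.6, 13.8)}
for name, (tops, eff) in measured.items():
    utilization = tops / peak
    power = implied_power_w(tops, eff)
    print(f"{name}: {utilization:.0%} of peak, about {power:.1f} W implied")
```

Under these assumptions, Network-A runs at roughly nine tenths of the theoretical peak, while VGG16 reaches about half of it.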

Safety mechanism for ASIL D tasks

Next-generation ADAS and AD systems are required to achieve ASIL D functional safety, the strictest safety level of ISO 26262. Dual core lockstep (DCLS) is one method that can satisfy the ASIL D metrics: faults are detected by performing the same process on two redundant hardware units and comparing their respective outputs.

The CNNA also requires hardware redundancy to meet the ASIL D metrics, but simply applying DCLS would require the large MAC compute unit to be duplicated. This is not practical because area and power consumption would increase significantly. To achieve the ASIL D metrics without adding redundant hardware, two CNNAs (CNNA1 and CNNA2) are dynamically configured by software to perform lockstep operation during processing that requires safety.

The CNNA is used both for image recognition processing of camera input (ASIL B) and for modeling the surrounding environment from the results of each sensor (ASIL D). Most of the execution time is spent on the former, the ASIL B image recognition processing. Therefore, by switching CNNA1 and CNNA2 to lockstep operation only during surrounding environment modeling, ASIL D tasks can be executed without significantly compromising performance or power efficiency.

The lockstep operation of the CNNAs using the lockstep DMAC (LDMAC) proceeds as follows.

1) LDMAC loads the same data from DRAM into SPM1 and SPM2.

2) CNNA1 and CNNA2 perform the same network processing.

3) LDMAC reads the execution results from SPM1 and SPM2 and compares them. If they do not match, it is judged as a fault. Only the result of CNNA1 is stored in
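The three steps above can be modeled in software as a rough illustration. The sketch below is not Renesas code: `Scratchpad`, `run_lockstep`, `toy_network`, and the fault injector are hypothetical names, and the "network" is a toy function standing in for real CNN processing. Because the destination of the CNNA1 result is cut off in the text, the sketch simply returns it to the caller.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class LockstepFault(Exception):
    """Raised when the two redundant results differ."""


@dataclass
class Scratchpad:
    """Hypothetical stand-in for one SPM: input data and result buffers."""
    data: List[int] = field(default_factory=list)
    result: List[int] = field(default_factory=list)


def run_lockstep(
    dram_input: List[int],
    network: Callable[[List[int]], List[int]],
    fault_injector: Optional[Callable[[List[int]], List[int]]] = None,
) -> List[int]:
    spm1, spm2 = Scratchpad(), Scratchpad()

    # Step 1: the LDMAC loads the same data into SPM1 and SPM2.
    spm1.data = list(dram_input)
    spm2.data = list(dram_input)

    # Step 2: CNNA1 and CNNA2 perform the same network processing.
    spm1.result = network(spm1.data)
    spm2.result = network(spm2.data)
    if fault_injector is not None:
        # Test hook only: corrupt CNNA2's result to emulate a hardware fault.
        spm2.result = fault_injector(spm2.result)

    # Step 3: the LDMAC reads both results and compares them.
    if spm1.result != spm2.result:
        raise LockstepFault("CNNA1 and CNNA2 results differ")

    # Only CNNA1's result is kept; here it is returned to the caller.
    return spm1.result


def toy_network(xs: List[int]) -> List[int]:
    """Toy 'layer': scale, shift, then ReLU."""
    return [max(0, 2 * x - 1) for x in xs]


print(run_lockstep([1, 0, 3], toy_network))
try:
    run_lockstep([1, 0, 3], toy_network, lambda r: [r[0] ^ 1] + r[1:])
except LockstepFault as exc:
    print("fault detected:", exc)
```

The comparison runs on the final results held in the scratchpads, which mirrors the description in which the LDMAC, rather than the compute units themselves, performs the check.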

Another important factor in achieving ASIL D is freedom from interference (FFI). The system runs a mix of tasks with different ASILs, and lower-ASIL tasks must not interfere with higher-ASIL tasks. As mentioned earlier, the CNNA is accessed by tasks at different ASIL levels, so the memory space used by each task must be separate.

The mechanism for memory space isolation is implemented in the CNNA, the LDMAC, and the memory protection tables of the memory management unit (MMU). The context index of the currently running task is attached to each transaction output from the CNNA and LDMAC. The MMU receives it and switches the context on a transaction-by-transaction basis.
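As a rough illustration of per-transaction context switching, the sketch below models an MMU that selects a memory protection table by the context index carried on each transaction. Everything in it is hypothetical: the class names, the region layout, the addresses, and the mapping of context 0 to an ASIL B task and context 1 to an ASIL D task are assumptions for the example, not details of the actual hardware.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# One region entry: (base address, size in bytes, writable).
Region = Tuple[int, int, bool]


class AccessViolation(Exception):
    """Raised when a transaction falls outside its context's regions."""


@dataclass(frozen=True)
class Transaction:
    context_index: int  # attached by the CNNA or LDMAC for the running task
    address: int
    is_write: bool


class ToyMMU:
    """Hypothetical MMU model with one protection table per context index."""

    def __init__(self, tables: Dict[int, Tuple[Region, ...]]):
        self._tables = tables

    def check(self, txn: Transaction) -> bool:
        # The table is selected anew for every transaction, so back-to-back
        # transactions from different tasks are checked independently.
        for base, size, writable in self._tables.get(txn.context_index, ()):
            in_range = base <= txn.address < base + size
            if in_range and (writable or not txn.is_write):
                return True
        raise AccessViolation(
            f"context {txn.context_index} may not access {txn.address:#x}"
        )


# Assumed layout: context 0 = ASIL B task, context 1 = ASIL D task.
mmu = ToyMMU({
    0: ((0x1000, 0x1000, True),),
    1: ((0x8000, 0x1000, True),),
})

print(mmu.check(Transaction(context_index=1, address=0x8010, is_write=True)))
try:
    # The ASIL B task tries to write into the ASIL D task's memory.
    mmu.check(Transaction(context_index=0, address=0x8010, is_write=True))
except AccessViolation as exc:
    print("blocked:", exc)
```

Keying the lookup on the transaction rather than on a global "current task" register is what lets tasks at different ASIL levels share the same accelerator without sharing memory.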

