The processing power of Digital Signal Controllers (DSCs) has been evolving to meet the demanding requirements of today’s real-time control applications. In addition to requiring increased Digital Signal Processing (DSP) performance, embedded applications also require extra Central Processing Unit (CPU) performance to further enhance communications and implement new functional safety and management features. These factors shape the way processing power is deployed in highly integrated microcontrollers and Digital Signal Controllers.
Contemporary digital power supplies are an example of an application that demands stellar DSP performance coupled with high CPU performance to meet stringent system specifications. In this application, the DSC is responsible for precise and efficient control of energy conversion using mathematical algorithms and real-time Pulse-Width Modulation (PWM) control. It also requires connectivity to relay real-time operational status and receive commands from a system-level management unit using protocols such as PMBus.
Another application example is an automotive fan or pump controller. While the closed-loop control of the motor leverages the signal processing capability of the DSC, communication with other modules in an automobile for control, status and diagnostics reporting is often accomplished through a protocol such as Controller Area Network Flexible Data-Rate (CAN-FD).
Some applications, such as an air conditioning unit, have complex and diverse control requirements. Today it is practical and cost-effective to control the power-factor-correcting power supply, a heavy-duty compressor, and the fan motor of the AC unit all from a single DSC. The supply, compressor, and fan each require different real-time critical control algorithms.
A single high-speed CPU core is, in theory, designed to handle low-latency real-time control tasks as well as networking and system management tasks through time-slicing in the execution of several independent threads. However, achieving high performance in any given process technology comes at the cost of higher power consumption and of the added software complexity needed to minimize the impact one thread might have on the performance of another thread running on the same CPU.
This software complexity, where one needs to determine how threads and interrupt handlers will meet their respective deadlines, becomes a challenging problem for the designer.
Conservative approaches maximize the amount of headroom the CPU has left over by leaving a relatively significant portion of processing cycles unallocated to guarantee that a thread can meet its deadlines under all conditions.
We also need to consider the impact of the overhead of frequent task switching on processing throughput. There can be significant overhead due to interrupt handling and the saving and restoring of internal state during task switching that is exacerbated when more and more threads are running simultaneously on a single core.
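The headroom and switching-overhead points above can be put into numbers with a simple utilization budget. The following is a minimal sketch, not a schedulability proof: the `task_t` type, the function names, and the idea of charging one fixed switching cost per task activation are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical task description: worst-case execution time and period,
   both in microseconds. */
typedef struct {
    double wcet_us;
    double period_us;
} task_t;

/* Fraction of CPU time consumed when every activation of every task
   also pays a fixed save/restore cost of switch_us microseconds. */
double loaded_utilization(const task_t *tasks, size_t n, double switch_us)
{
    double u = 0.0;
    for (size_t i = 0; i < n; ++i)
        u += (tasks[i].wcet_us + switch_us) / tasks[i].period_us;
    return u;
}

/* Returns 1 if the load leaves at least `headroom` (0.0 to 1.0) of the
   CPU unallocated, otherwise 0. */
int fits_with_headroom(const task_t *tasks, size_t n,
                       double switch_us, double headroom)
{
    return loaded_utilization(tasks, n, switch_us) <= 1.0 - headroom;
}
```

With three illustrative tasks of 10/50, 20/100 and 30/200 µs and a 2 µs switching cost, the load is about 62 percent, so a 30 percent headroom target is met but a 40 percent target is not. Moving one of the tasks to a second core removes both its execution time and its switching cost from this budget.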
How do you improve your processing efficiency? The answer is simple: a dual-core solution.
Imagine if you could run multiple threads without the overhead of having to save and restore the CPU’s state every time the system needed to switch between them.
With a dual-core implementation, the workload of multiple threads can be divided across independent cores, thereby minimizing some of the thrashing created by context switches. The inherent parallelism of multiple cores allows lower core clock frequencies to be used for a given level of throughput, which could be a better match for flash memory speeds. Thus, we can reduce or eliminate the stall cycles (wait states) in which the processor needs to wait for instructions or data from the flash memory. Lastly, running each core slower can improve overall energy efficiency: circuits that do not need to be designed to run as fast can use less power.
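The relationship between core clock and flash wait states can be sketched with a deliberately simplified model. The function names and the 20 ns access time used in the note below are illustrative assumptions, not dsPIC33CH datasheet values, and the model ignores prefetch buffers and caches, which real devices use to hide part of this penalty.

```c
#include <stdint.h>

/* Simplified model: a flash access takes ceil(access_ns * f_mhz / 1000)
   CPU cycles, and every cycle beyond the first is a wait state. */
uint32_t flash_wait_states(uint32_t f_mhz, uint32_t access_ns)
{
    uint32_t cycles = (f_mhz * access_ns + 999u) / 1000u; /* ceiling division */
    return cycles > 1u ? cycles - 1u : 0u;
}

/* Worst-case instruction rate in millions of instructions per second,
   assuming one instruction per cycle at zero wait states and that every
   fetch pays the full wait-state penalty. */
uint32_t worst_case_mips(uint32_t f_mhz, uint32_t access_ns)
{
    return f_mhz / (1u + flash_wait_states(f_mhz, access_ns));
}
```

Under these assumptions, a single core at 100 MHz with 20 ns flash needs one wait state and has the same worst-case rate as a core at 50 MHz with none, so two 50 MHz cores would deliver twice the worst-case throughput of the single faster core.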
Leveraging multiple cores in an application often results in greater determinism for time-critical tasks and it can simplify development efforts.
In some applications, a single core is more efficient at handling large sets of related data that involve closely coupled tasks. But this is not true in the case where different functions are being executed in a high-performance embedded application. Here, it makes sense and is more efficient to use more than one core since the computational functions are loosely coupled.
Single-Core Solutions vs. Dual-Core Solutions
Let’s take the example of a power supply, where the closed-loop control for the power conversion is implemented in firmware. In this case the quality of the power supply’s output is determined mostly by the latency, that is, how much time it takes to convert an analog sample to digital, perform a complex calculation resulting in a new duty cycle from that data, and then update the PWM with that new information. With a multi-core controller, latency-critical functions such as this are not hampered by other system activities, since unrelated system tasks can be executed on another core that doesn’t have these time-critical tasks to perform.
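The compute step in that sample, calculate, update sequence might look like the following fixed-point PI compensator. This is a minimal sketch under stated assumptions: the article does not say which control law is used, the `pi_ctrl_t` type and `pi_step` function are hypothetical, and a production digital power supply would more likely use a higher-order compensator tuned for the converter.

```c
#include <stdint.h>

/* Hypothetical PI compensator state. Gains are Q15 fixed point
   (32768 represents 1.0); duty limits are in PWM counts. */
typedef struct {
    int32_t kp_q15;
    int32_t ki_q15;
    int32_t integral;   /* accumulated error, in ADC counts */
    int32_t duty_min;
    int32_t duty_max;
} pi_ctrl_t;

/* One control-loop iteration: takes the latest ADC sample and returns
   the new duty cycle. The integral is only committed when the output is
   not saturated (a simple anti-windup scheme). */
int32_t pi_step(pi_ctrl_t *c, int32_t reference, int32_t sample)
{
    int32_t error = reference - sample;
    int32_t integral = c->integral + error;
    int64_t acc = (int64_t)c->kp_q15 * error + (int64_t)c->ki_q15 * integral;
    int64_t duty = acc / 32768;          /* scale back from Q15 */

    if (duty > c->duty_max) return c->duty_max;
    if (duty < c->duty_min) return c->duty_min;
    c->integral = integral;
    return (int32_t)duty;
}
```

On a real device this function would be called from the ADC-complete interrupt and its result written to the PWM duty register. Keeping that interrupt on a core with no communication stack is what keeps the time from sample to PWM update short and repeatable.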
In parallel with the time-critical control loop calculations running on one CPU core, another core can be tasked with other responsibilities such as PMBus communications and system monitoring functions. Similarly, in a motor control application, splitting the closed-loop control processing and the CAN interface stack execution across different cores ensures that the motor’s commutation is precise and deterministic.
From a project development standpoint, overall design time can be reduced when the firmware is partitioned across different CPU cores. Applications are easier to develop, and the code for each core can even be created by different design teams residing at different locations. The separately coded functions require less integration than would be necessary if all the code had to interoperate and run on a single CPU core, and it is easier to debug on separate cores.
Dual-Core Solutions and Their Underlying Benefits
Unified Digital Signal Controller architectures such as Microchip’s dsPIC33 core helped solve synchronisation problems by bringing the two types of execution behaviour together into a single architecture. With such a pipeline, one can stream multiply-accumulate and matrix operations at high speeds while still providing fast pointer-chasing and change-of-flow capability. Also, high responsiveness to interrupts can enable real-time reaction and adaptation to changing conditions. However, challenges of code integration exist irrespective of the architecture customers choose. Often, we see development teams split the combination of communication and control functionality in many applications between themselves, based on specialization.
This solution has its own set of challenges. Figuring out scheduling and task prioritization is a key issue while integrating code from two or more teams. While such decisions may seem small, they can have a major impact on the overall real-time behaviour of the application. Knowledgeable engineers are responsible for setting task priorities across two or more processes.
Allocating separately developed code to different cores makes it easier to manage and integrate diverse code. It also simplifies decisions about each core’s resources, such as how best to allocate and use the limited data RAM of each core.
The Advent of dsPIC33CH
Although distributing processing tasks across multiple cores optimises both development effort and processing throughput, Microchip is constantly innovating to help increase overall performance of its controllers. A performance-enhancing example in the dual-core dsPIC33CH is the implementation of context-selected 40-bit accumulators to boost interrupt service routine responsiveness. The contents of these accumulators do not need to be saved to the stack during a context switch, which saves CPU cycles and reduces the time it takes to perform a context switch. The new dsPIC33CH core also implements additional instructions to increase DSP performance, such as data limit instructions that clamp a value between an upper and a lower bound in a single cycle. Other additions include instructions for 32-bit loads and stores to the 40-bit accumulators, an accumulator normalization instruction, faster divides and bit field instructions.
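To illustrate what a data limit instruction replaces, the same bounding operation written in portable C takes two comparisons and up to two branches. The function name `limit_i16` is hypothetical, and whether a given compiler maps this pattern onto the hardware instruction is not something this sketch assumes.

```c
#include <stdint.h>

/* Clamp value into the range [lower, upper]; the caller is assumed to
   pass lower <= upper. A data limit instruction performs the equivalent
   bounding in hardware. */
int16_t limit_i16(int16_t value, int16_t lower, int16_t upper)
{
    if (value < lower) return lower;
    if (value > upper) return upper;
    return value;
}
```

In a control loop this kind of bounding is applied on every iteration, for example to keep a computed duty cycle within the limits the PWM hardware accepts, which is why doing it in a single cycle matters.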
The dsPIC33CH DSC is highly integrated with many advanced peripherals such as high-speed ADCs, DACs with waveform generation, analogue comparators with reference DACs, analogue programmable gain amplifiers and high-resolution PWM generators that have resolution down to 250 ps to help rein in system costs and board size.
Peripherals that tend to interrupt the CPU core frequently can be an impediment to overall system performance. To combat this, the dsPIC33CH includes intelligent peripherals and a peripheral trigger generator that can relieve the CPU of having to respond to as many interrupts. For example, the device’s UARTs can reduce software overhead through hardware support for LIN/J2602, IrDA®, DMX and smart card protocol extensions. High-performance peripherals such as the CAN-FD module include features such as dedicated DMA that allow them to run more autonomously from the CPU core. Microchip’s dsPIC33CH is optimised for high-performance and time-critical, real-world embedded-control applications. With its dual cores, this product family offers designers increased performance while easing software development by enabling tasks to be partitioned easily across the cores. The dsPIC33CH is architected so that engineering teams can “design separately, integrate seamlessly.”