Navigating, thriving, and standing out in the world of smart, connected devices has become increasingly difficult. These devices are growing more complex, software-driven, and interconnected, and they are expanding beyond traditional product boundaries. They must regularly evolve and adapt to new trends, standards, and security requirements. Keeping them running smoothly and securely calls for a more strategic approach to over-the-air upgrades. Over-the-Air (OTA) update technology is one of the most widely used ways to upgrade software and deliver a good customer experience.
This article explores the OTA design challenges of an embedded IoT system in detail and how to tackle them in product design. It also offers tips and tricks for building complete, powerful OTA features so that intelligent IoT solutions can be developed faster.
While OTA makes it possible to update firmware remotely, it comes with a number of challenges; Figure 1 below shows the primary ones. Let us first understand these design challenges, and then explore how to overcome them when designing an embedded IoT solution.
Challenges of designing a secure field-upgradable system
1. Security:
Embedded devices are prime targets for attacks, because a successful attack can give intruders access to the data the devices produce, receive, and process. This can have serious ramifications for the larger system that the embedded device powers: for example, shutting down an entire connected home, taking control of the car you drive, or accessing all the personal data on your gadgets. A compromised system is a serious threat because it may hold personal or organizational data whose exposure can endanger life, reputation, and finances. The list of potential OTA system vulnerabilities is extensive: Man-in-the-Middle (MITM) attacks, malware introduction, outdated software, lack of encryption, and physical access to the devices are all possibilities. As a result, attackers may alter a legitimate update, inject malicious code, take control of a device, and reprogram crucial embedded system components.
2. Context awareness:
Software should be intelligent enough NOT to perform updates in the middle of a critical task. It should have built-in context awareness and give the user the safest possible experience. For example, upgrading the firmware of a smartwatch while the user is hiking may pose a critical risk. Any failure during the update process may hinder the user experience.
3. Robustness / Power failure and recovery:
The device being updated may reboot in the middle of the OTA process for several reasons, including user-forced shutdowns and discharged or dead batteries. Warning users not to force a shutdown and checking the battery charge level before starting the update are very basic measures, and they may not be sufficient for a robust system. The design should account for the risk of power loss in the middle of an update and apply countermeasures so that these scenarios do not brick the device.
4. End user Experience:
The final user experience is determined by a number of aspects, such as the update time, frequency, and type of updates. To provide a good user experience, the update time should be as short as possible. Too many updates can be bothersome, while too few can lead to low consumer engagement. While automated updates are a nice idea, they are not always appropriate. For example, customers may prefer to keep the earlier UI and features even if the manufacturer wishes to introduce a major change to the UI of a fitness tracker.
5. Scalability:
As the number of connected devices rises, OTA systems must be capable of adapting to changing needs and of serving anywhere from a small number of devices to millions of devices. These devices could be used anywhere in the world.
For example, a fitness tracker band with the same brand and model may be available all over the world. This also adds to the connectivity problems, as devices may not always be connected to the network. It is essential to roll out an update to all devices in order to provide a favorable consumer experience.
6. Size of the firmware:
The size of the firmware grows in tandem with the number of components and computing power on the edge devices. Programmable memory in modern microcontrollers ranges from a few hundred KBs to several MBs.
Upgrading the complete firmware at once may be a difficult operation because it necessitates downloading several MBs of the image, storing, verifying, and installing it. The entire process may have an impact on the device’s uptime, battery life, and, most importantly, the user experience.
7. Version management:
Managing the firmware version that would potentially be sent to millions of devices is a huge task. Firmware updates are unlikely to arrive all at once. Instead, enhancements and fixes will be released in stages. Distinct versions of hardware, as well as potentially separate versions of software, may exist on different end user devices. It can be difficult to keep track of and allow the smooth distribution of firmware without hindering the
Tackling OTA challenges in a real embedded IoT system
Typical OTA flow
The firmware to be upgraded is signed and stored on the host / cloud. The device downloads the firmware image from this host / cloud, then authenticates and validates it using pre-defined mechanisms. This flow is shown in Figure 2 below.
Figure 2: Typical OTA Application Flow
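The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: `VENDOR_KEY`, `sign_firmware`, and `validate_package` are hypothetical names, and an HMAC with a shared secret stands in for the vendor's signature only because it is available in the standard library. A real system would normally use an asymmetric signature so that devices hold only a public key.

```python
import hashlib
import hmac

# Hypothetical shared secret standing in for the vendor's signing key.
# A production system would use an asymmetric key pair instead.
VENDOR_KEY = b"demo-key-not-for-production"


def _tag(version: str, image: bytes) -> str:
    """Authentication tag over the version string and the image."""
    message = version.encode() + b"\0" + image
    return hmac.new(VENDOR_KEY, message, hashlib.sha256).hexdigest()


def sign_firmware(image: bytes, version: str) -> dict:
    """Host / cloud side: package the image with its digest and tag."""
    return {
        "version": version,
        "sha256": hashlib.sha256(image).hexdigest(),
        "tag": _tag(version, image),
        "image": image,
    }


def validate_package(package: dict) -> bool:
    """Device side: check integrity first, then authenticity."""
    image = package["image"]
    if hashlib.sha256(image).hexdigest() != package["sha256"]:
        return False  # image corrupted in transit or in storage
    expected = _tag(package["version"], image)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, package["tag"])


package = sign_firmware(b"\x01\x02\x03 firmware payload", "1.2.0")
tampered = dict(package, image=package["image"] + b"\xff")

print(validate_package(package))   # genuine package
print(validate_package(tampered))  # image modified after signing
```

The device installs the image only when `validate_package` succeeds; an attacker who recomputes the plain digest for a modified image still fails the tag check.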
Let’s learn how to overcome the challenges discussed above using some real-world embedded IoT application examples.
1. Security is the key to product success: The security journey begins with the conceptualization and early design of the product. To ensure embedded system security, hardware, software, and cloud providers must collaborate. For example, device boot integrity checks, chain of trust, and strong key management and encryption are computationally intensive for embedded software alone. Hardware and software design must work together here to provide the optimal solution with reasonable performance. The OS or middleware can provide functionality such as access control policies, encrypted file systems, rootless execution, path space control, and thread-level anomaly detection by relying on hardware capabilities. One example is the PSoC™ 64 device portfolio of PSA Certified security MCUs, with TF-M support and secured over-the-air updates. The sections below outline the security measures that should be pursued for secured OTA.
a. Protect data at rest and in transit: All sensitive information, configuration data, security keys, and passwords stored on an embedded device must be protected. This is usually accomplished by data encryption. The private keys required to encrypt the data must be kept in dedicated security hardware, such as a Secure Element. Cryptographic processes, such as signature verification and authentication, are conducted within the Secure Element, and only the results of these operations are accessible to other components and applications. Secret keys are protected from being extracted from the Secure Element. Many security solutions are available to designers who need end-to-end security for their applications. For example, the OPTIGA™ Trust M device provides end-to-end security solutions for building secured IoT applications. Additionally, data received from any external source should be validated before being passed to critical components.
b. Digital signing: A digital signature verifies the authenticity of the image and ensures the integrity of the data in it. Without a signature, there is no way to verify that the image came from an authentic source; it could have come from a malicious third party. A digital signature also confirms that the data within the image has not been modified (preserving integrity) and is intact as generated by the source/author. Each firmware update must be digitally signed by the vendor, and the digital signature must be verified by the IoT device before starting the update. Microcontrollers with full hardware cryptography support help address the security challenges of the IoT.
c. Select a security MCU: Select an MCU that supports rich security features, cryptography, and secured device provisioning services. Use an isolated secure processing environment (SPE) and a non-secure processing environment (NSPE) to reduce the attack surface by minimizing the untrusted or unverified part of the firmware. One solution is to isolate the execution and resources of the different processes, either by selecting processors with built-in hardware isolation (such as the Arm Cortex-M33) or dual-core CPUs with a dedicated CPU for security. Use a Trusted Execution Environment (TEE) for hardware-level isolation of security-critical operations, as in the PSoC™ 64 security microcontroller, which implements a dedicated security core to provide resource isolation.
d. Build a chain of trust (secured boot): A chain of trust is a series of checks performed while the device boots to confirm that each component is trusted, starting from device power-on. For example, PSoC™ 64 security MCU parts are shipped from the factory with a hardware-based Root-of-Trust (RoT) to verify that the MCU came from a protected source. The RoT is then passed on to a developer, who installs a secured bootloader and security policies on the device. During the boot phase, the RoT verifies the integrity and validity of the bootloader, which then verifies the integrity and authenticity of any second-stage bootloader or software, which in turn verifies the authenticity and integrity of the application. Following that, the application verifies the validity and integrity of its data, keys, operating settings, and so on.
e. Encrypt your firmware updates: Encrypting the firmware lowers the risk of it being reverse engineered or easily analyzed by an adversary. The device should accept updates only from a trusted source, then decrypt the image and deploy it. This is a crucial security requirement for wireless over-the-air updates, since the data is susceptible to sniffing on the communication channel. In addition to the security measures that protect the data, the communication channel itself should be protected using whatever protection mechanism is applicable and feasible for the technology in use.
3. Build intelligent and context-aware solutions: OTA systems must be aware of the context before performing an update. Software should be able to monitor user behavior and understand the user's current context before deciding to deploy an update.
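The chain of trust described in item (d) can be sketched as a sequence of checks in which each stage holds the expected digest of the next. This is a simplified illustration: the stage names, `provision`, and `boot` are hypothetical, and plain SHA-256 digests stand in for the signature checks a real secured boot would use (signatures let a stage be updated without re-provisioning the stage before it).

```python
import hashlib


def record_digest(record: dict) -> str:
    """Digest covering a stage's payload and its reference to the next stage."""
    data = record["next_digest"].encode() + record["payload"]
    return hashlib.sha256(data).hexdigest()


def provision(stages: list) -> tuple:
    """Build the chain from the last stage back to the first.

    stages is a list of (name, payload) pairs in boot order. Returns the
    digest that the Root-of-Trust must hold, plus the chain itself.
    """
    chain = []
    next_digest = ""  # the final stage has nothing further to verify
    for name, payload in reversed(stages):
        record = {"name": name, "payload": payload, "next_digest": next_digest}
        next_digest = record_digest(record)
        chain.append(record)
    chain.reverse()
    return next_digest, chain


def boot(root_digest: str, chain: list) -> list:
    """Verify each stage before running it; stop at the first failure."""
    expected = root_digest
    started = []
    for record in chain:
        if record_digest(record) != expected:
            break  # untrusted stage: refuse to hand over control
        started.append(record["name"])
        expected = record["next_digest"]
    return started


root, chain = provision([
    ("bootloader", b"bootloader-v1"),
    ("second_stage", b"second-stage-v1"),
    ("application", b"application-v1"),
])
print(boot(root, chain))

chain[2]["payload"] = b"modified-application"
print(boot(root, chain))
```

Because every digest also covers the reference to the following stage, changing any later stage is detected at the point where control would be handed to it, and the boot stops there.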
For example, a smart watch should be able to detect the user activities like jogging and hiking and defer the firmware update to a future time to avoid any possible loss of activity data during update. Each user will have their own routine, which the system software should be able to identify and initiate upgrades during the least active hours for a better experience.
While automatic updates are a great idea, they do not work in all situations. Never force a mandatory update on customers unless it is a critical security patch. For example, an update may include a significant modification to the UI, but some customers may want to keep the existing version of the UI they are comfortable with. Always inform users about the nature of the update before deploying it, and allow them to accept, postpone, or reject it.
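One way to express these rules is a small decision function that runs before any installation starts. The sketch below is illustrative only: the activity labels, the `quiet_hours` field (hours the system has learned to be least active for this user), and the ordering of the rules are assumptions rather than a prescribed policy.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    INSTALL = "install"
    DEFER = "defer"
    SKIP = "skip"


# Hypothetical activity labels reported by the device's activity detection.
BUSY_ACTIVITIES = {"jogging", "hiking"}


@dataclass
class DeviceContext:
    activity: str           # e.g. "idle", "jogging", "hiking"
    local_hour: int         # 0-23
    quiet_hours: frozenset  # least active hours learned for this user


@dataclass
class PendingUpdate:
    critical_security: bool
    user_choice: str        # "accept", "postpone", or "reject"


def decide(ctx: DeviceContext, update: PendingUpdate) -> Decision:
    # Never interrupt a tracked activity, even for a critical patch.
    if ctx.activity in BUSY_ACTIVITIES:
        return Decision.DEFER
    # Critical security patches are the only updates that override the user.
    if update.critical_security:
        return Decision.INSTALL
    if update.user_choice == "reject":
        return Decision.SKIP
    if update.user_choice == "postpone":
        return Decision.DEFER
    # Accepted updates still wait for the user's least active hours.
    if ctx.local_hour not in ctx.quiet_hours:
        return Decision.DEFER
    return Decision.INSTALL


night = frozenset({1, 2, 3, 4})
hiking = DeviceContext(activity="hiking", local_hour=10, quiet_hours=night)
idle_night = DeviceContext(activity="idle", local_hour=3, quiet_hours=night)

print(decide(hiking, PendingUpdate(critical_security=True, user_choice="accept")))
print(decide(idle_night, PendingUpdate(critical_security=False, user_choice="accept")))
```

Keeping the policy in one function makes it easy to review and test separately from the download and installation code.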
4. Fail-safe boot: In the Internet of Things, it’s more important than ever to make sure you can keep your things up to date quickly and flawlessly. While updating the firmware, care must be taken not to break the device functionality. It is also important that the firmware update strategy is planned to recover from errors and accidental power failures. Although devices such as fitness trackers are battery-operated, the battery discharge curve is not always linear, so checking the battery status before the update is not sufficient on its own. An error recovery mechanism must be built into the bootloader to provide a fail-safe solution in the field.
An update interrupted by a power outage or a network outage can brick a device, necessitating a full recovery. However, if the update is applied atomically, with a rollback to the last working version, an unexpected event will not brick the device.
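A common way to get this atomic behavior is to keep two firmware slots and switch between them only after the new image is complete. The sketch below models that bootloader logic in Python for readability; the slot names, `MAX_BOOT_ATTEMPTS`, and the function names are assumptions, and a real bootloader would keep this state in flash and implement the logic in C.

```python
from dataclasses import dataclass

MAX_BOOT_ATTEMPTS = 3  # assumed limit before the bootloader rolls back


@dataclass
class Slot:
    version: str
    valid: bool = False  # image fully written and verified
    attempts: int = 0    # boots tried since the image was staged


@dataclass
class BootState:
    slots: dict
    active: str = "A"    # last known-good slot
    pending: str = ""    # slot awaiting its first confirmed boot


def stage_update(state: BootState, version: str, write_ok: bool) -> None:
    """Write the new image to the inactive slot; the active slot is untouched."""
    target = "B" if state.active == "A" else "A"
    state.slots[target] = Slot(version=version, valid=write_ok)
    # The switch is requested only after the image is completely written.
    state.pending = target if write_ok else ""


def select_boot_slot(state: BootState) -> str:
    """Bootloader decision taken at every reset."""
    if state.pending:
        slot = state.slots[state.pending]
        if slot.valid and slot.attempts < MAX_BOOT_ATTEMPTS:
            slot.attempts += 1
            return state.pending
        state.pending = ""  # give up on the new image and roll back
    return state.active


def confirm_boot(state: BootState, slot_name: str) -> None:
    """Called by the new firmware once its self-tests have passed."""
    if slot_name == state.pending:
        state.active = slot_name
        state.pending = ""


state = BootState(slots={"A": Slot("1.0", valid=True)})
stage_update(state, "1.1", write_ok=False)  # power lost during the write
print(select_boot_slot(state))              # still boots the old image
```

If the new image is written correctly but never confirms a successful boot, `select_boot_slot` falls back to the last known-good slot after `MAX_BOOT_ATTEMPTS` tries.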
5. End user Experience: It is widely accepted that the end user experience determines the value of a brand. One of the most important variables in establishing the user experience is the amount of time the OTA update takes. The time required for a firmware update is determined by the size of the firmware, the speed of the interfaces used to download it, the storage technology used, and the security countermeasures implemented. Optimize firmware update time by selecting the appropriate security measures and optimizing the firmware size. You might even wish to divide the firmware into logical chunks and update them separately as needed.
The quicker the firmware update, the better the customer experience. To enhance the customer experience further, it may be essential to restore the user's configuration after the firmware update so that it matches what was there before. The software should therefore be designed to store device configuration and user data, such as network configuration and key feature settings, in a form that future firmware versions can read.
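One possible way to keep stored settings readable by future firmware is to tag the saved configuration with a schema number and migrate it step by step after an update. The sketch below assumes a JSON blob and an invented change between schema 1 and schema 2 (a 0-10 `brightness` value becoming a 0-100 `brightness_pct`); all field names are hypothetical.

```python
import json

CURRENT_SCHEMA = 2  # schema understood by the firmware that is now running


def save_config(config: dict) -> str:
    """Serialize the configuration before the update starts."""
    return json.dumps(config, sort_keys=True)


def _v1_to_v2(cfg: dict) -> dict:
    """Hypothetical change: brightness 0-10 becomes a 0-100 percentage."""
    cfg = dict(cfg)
    cfg["brightness_pct"] = cfg.pop("brightness") * 10
    cfg["schema"] = 2
    return cfg


# One migration step per schema version, applied in order.
MIGRATIONS = {1: _v1_to_v2}


def load_config(blob: str) -> dict:
    """Restore the configuration after the update, migrating if it is older."""
    cfg = json.loads(blob)
    version = cfg.get("schema", 1)
    while version < CURRENT_SCHEMA:
        cfg = MIGRATIONS[version](cfg)
        version = cfg["schema"]
    return cfg


saved_by_old_firmware = save_config(
    {"schema": 1, "wifi_ssid": "home", "brightness": 7}
)
print(load_config(saved_by_old_firmware))
```

Settings that did not change, such as the network name here, pass through untouched, so the user does not have to set up the device again after the update.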
6. Size of the update: The solution must be designed so that each component of the system can be updated independently, rather than upgrading the entire firmware at once. Often it is not necessary to upgrade every component in the system. For example, you may want to separate the Wi-Fi firmware, Bluetooth® stack, etc. from the main firmware and update them independently only when required.
Because OTA is a critical component, it should be thoroughly tested across all possible use case scenarios to avoid failures in the field. Alternatively, you may select a readily available vendor-provided OTA framework, such as the cloud-agnostic OTA framework from Infineon on GitHub.
An IoT product might use several microcontrollers and sensors originating from different vendors. Each of these vendors releases bug fixes or updates to its software as needed to improve its product. A well-designed OTA architecture must be able to update all such firmware independently.
7. Version management: One of the important security measures required for OTA-capable IoT devices is “limited” or “no” permission to downgrade the firmware. A versioning system must be put in place to safeguard the device from firmware downgrades. Furthermore, there may be numerous hardware models deployed in the field, all of which use the same firmware or a flavor of the firmware derived from a common source. The OTA system should be able to manage such complexity and keep track of all potential hardware and software combinations in the field. The versioning system should also be able to distinguish the devices that require an update from those that do not, and manage them accordingly.
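Independent updates usually start with a per-component manifest that the device compares against what it has installed. The following sketch assumes dotted numeric version strings and uses made-up component names and version numbers.

```python
def parse_version(text: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def components_to_update(installed: dict, available: dict) -> list:
    """Names of components whose available version is newer than installed."""
    pending = []
    for name, new_version in sorted(available.items()):
        current = installed.get(name)
        if current is None or parse_version(new_version) > parse_version(current):
            pending.append(name)
    return pending


# Hypothetical component names and versions.
installed = {"main_app": "2.3.0", "wifi_fw": "7.45.1", "bt_stack": "1.10.0"}
available = {"main_app": "2.3.0", "wifi_fw": "7.45.2", "bt_stack": "1.9.0"}

print(components_to_update(installed, available))
```

Only the Wi-Fi firmware is selected here, so the device downloads one small image instead of the full firmware. Comparing parsed tuples rather than raw strings matters: as plain text, "1.9.0" would sort after "1.10.0".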
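A minimal sketch of such a check is shown below. It assumes each release carries the set of hardware revisions it supports and a monotonically increasing security counter, and that the device records the lowest counter it will still accept; the field names and result strings are illustrative, not part of any specific framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    version: tuple            # e.g. (1, 4, 0)
    hardware_revs: frozenset  # hardware revisions this build supports
    security_counter: int     # raised whenever a security fix ships


@dataclass
class Device:
    hardware_rev: str
    version: tuple
    min_security_counter: int  # kept in tamper-resistant storage in a real design


def check_update(device: Device, release: Release) -> str:
    if device.hardware_rev not in release.hardware_revs:
        return "incompatible-hardware"
    # Anti-rollback: refuse builds that predate a security fix already applied.
    if release.security_counter < device.min_security_counter:
        return "rejected-downgrade"
    if release.version <= device.version:
        return "not-newer"
    return "eligible"


device = Device(hardware_rev="revB", version=(1, 3, 0), min_security_counter=5)
old_build = Release((1, 2, 0), frozenset({"revA", "revB"}), security_counter=4)
new_build = Release((1, 4, 0), frozenset({"revB", "revC"}), security_counter=6)

print(check_update(device, old_build))
print(check_update(device, new_build))
```

Running the same check on the server side against the fleet inventory separates the devices that need a given release from those that do not.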
OTA updates are a critical infrastructure component to nearly every embedded IoT device. Sure, there are systems out there that once deployed will never update; however, those are probably a small fraction. OTA updates are the go-to mechanism to update firmware in the field.
We’ve examined several best practices that developers and companies should consider when they start to design their connected systems. In fact, the bonus best practice for today is that if you are building a connected device, make sure you explore your OTA update solution sooner rather than later in your design cycle. Otherwise, you may find that building the chain of trust necessary in today’s deployments will be far more expensive and time-consuming to implement. Also remember to run regression tests on your OTA solution for all the possible combinations in the field. Stay tuned for part 2, which covers connectivity and security requirements for a real application such as the smart home market, along with a solution that integrates all the requirements including user interface, sensing, connectivity, and security.
Jaya Kathuria Bindra works as Director, Applications Engineering at Infineon Technologies where she is managing the Embedded Applications and Solutions Development group using the PSoC and WiFi/BT platform. She has 18+ years of experience in the Semiconductor Industry. She earned her MBA credential from IIM, Bangalore and holds a bachelor’s degree in Electronics Engineering from Kurukshetra University. Jaya can be reached at [email protected]
Ravikiran HV is a Sr Staff Applications Engineer at Infineon Technologies, where he works in the Embedded Applications group. He has 11+ years of domain experience. He holds a bachelor’s degree in Electronics and Communication Engineering from VTU. Ravikiran can be reached at [email protected]