Another IoT Security "Uh-Oh": 26 Flaws in Open-Source Zephyr and MCUboot Stacks

By Brandon Lewis

Editor-in-Chief

Embedded Computing Design

June 18, 2020


Jennifer Fernick, Head of Research at NCC, spoke with Embedded Computing Design about 26 vulnerabilities found in the Zephyr RTOS and MCUboot bootloader during a recent security analysis.

The open-source Zephyr RTOS and MCUboot projects have been growing in popularity for both suppliers and developers in the IoT ecosystem, as together they include all of the drivers, libraries, stacks, and file systems required to develop a full-blown application suitable for tiny edge devices. It also doesn't hurt that they're free.

As with any open-source technology, maintaining and improving the Zephyr and MCUboot codebases falls to the community, which many equate to better security. Indeed, a recent Forbes article covering the Zephyr project stated "open-source software is generally deemed more secure, as anyone can inspect and debug the code."

I guess the operative word in that sentence is "generally."

NCC Group, one of the world's largest security consulting firms, recently conducted an independent research examination that analyzed the security posture of both the Zephyr RTOS and MCUboot secure bootloader, and found 26 vulnerabilities of varying severity.

Jennifer Fernick, Head of Research at NCC, spoke with Embedded Computing Design about findings from the company's "Research Report – Zephyr and MCUboot Security Assessment."

What critical vulnerabilities were found in your examination of the Zephyr RTOS and MCUboot secure bootloader, and, more importantly, what are their implications on the security of IoT systems?

FERNICK: The research in this report focused on the Zephyr RTOS and the MCUboot bootloader, where the researchers uncovered 25 vulnerabilities in Zephyr and 1 vulnerability in MCUboot. We know that the Zephyr RTOS holds about 3 percent of Internet of Things market share, and that this influence is growing rapidly as a result of the support of many chipset manufacturers including Intel, NXP, Nordic Semiconductor, and others. 

The critical vulnerabilities were both in the Zephyr network stack: a stack buffer overflow and a memory corruption vulnerability. Through exploitation of the buffer overflow, an attacker may cause a denial of service or gain code execution within the device kernel when a malicious ICMP packet is received on devices that enable the specific build options. Through exploitation of the memory corruption vulnerability, a remote adversary can send an MQTT packet with a malformed header to induce memory corruption within the Zephyr kernel, possibly leading to code execution.
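To make "malformed header" concrete, here is a minimal C sketch of defensively decoding the Remaining Length field of an MQTT fixed header, which the MQTT specification encodes in one to four bytes with a continuation bit. It is an illustration only and is not Zephyr's code; the function name `mqtt_decode_remaining_length` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MQTT_MAX_LEN_BYTES 4u  /* the MQTT spec allows at most 4 length bytes */

/* Hypothetical helper: decodes the variable-length Remaining Length field.
 * Returns the number of bytes consumed, or -1 if the field is truncated
 * or still has its continuation bit set on the fourth byte. */
int mqtt_decode_remaining_length(const uint8_t *buf, size_t buf_len,
                                 uint32_t *out)
{
    uint32_t value = 0;

    for (size_t i = 0; i < MQTT_MAX_LEN_BYTES; i++) {
        if (i >= buf_len) {
            return -1;                      /* ran out of input */
        }
        /* Low 7 bits carry data, least significant group first. */
        value |= (uint32_t)(buf[i] & 0x7Fu) << (7u * i);
        if ((buf[i] & 0x80u) == 0) {        /* no continuation bit: done */
            *out = value;
            return (int)(i + 1);
        }
    }
    return -1;                              /* more than 4 length bytes */
}
```

The loop bound and the `buf_len` check are what keep a hostile packet from steering the decoder past the end of the input.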

The high-risk vulnerabilities were both data validation bugs in the Zephyr USB stack: a global buffer overflow in the driver used for firmware updates over USB, and an arbitrary read/limited write in the USB Mass Storage Driver. Through exploitation of these USB vulnerabilities, an adversary with physical access to a Zephyr device can induce a denial of service or possibly achieve code execution within the kernel. A denial of service could mean that IoT devices deployed in remote locations require someone to physically visit the device and perform a manual reboot before it is operational again. Kernel-level code execution allows the attacker to compromise and take complete control of the device, running arbitrary code and undermining its functionality, and could potentially even include pivoting to attack other devices on the same network or attempting to achieve a foothold or persistence on the associated network.

How difficult is it to remedy these Zephyr vulnerabilities? Let's focus specifically on those in the USB stack.

FERNICK: NCC Group reported five vulnerabilities in the Zephyr USB stack. The highest-risk findings are "USB DFU Mode Can Overflow a Global Buffer in the DFU_Upload Command" and "Arbitrary Read and Limited Write in the USB Mass Storage Driver."

The USB DFU Mode Global Buffer Overflow compromise involves exploitation of a buffer overflow vulnerability present in Zephyr’s USB DFU driver, which is typically used for local firmware update over USB. Through this vulnerability, an adversary with physical access to the device is able to induce, at minimum, a denial of service within the device, and in some cases even achieve code execution within the kernel, by inserting a malicious payload into internal flash memory via the USB DFU interface and then triggering a global buffer overflow. However, it should be noted that the exploitability of this depends on the memory layout of the specific Zephyr build. This issue has been fixed; the remedy is to introduce basic checks on the sizes in specific input variables to prevent the buffer overflow.
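As a rough illustration of what "basic checks on the sizes in specific input variables" can look like, the C sketch below rejects a host-supplied length before it is used to copy into a fixed-size global buffer. It is not the Zephyr patch; `upload_chunk`, `xfer_buf`, and `XFER_BUF_SIZE` are hypothetical names chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XFER_BUF_SIZE 64u              /* hypothetical transfer buffer size */

static uint8_t xfer_buf[XFER_BUF_SIZE];

/* Hypothetical upload handler: copies up to req_len bytes of the image,
 * starting at offset, into the global transfer buffer.
 * Returns the number of bytes copied, or -1 if the request is invalid. */
int upload_chunk(const uint8_t *image, size_t image_len,
                 size_t offset, size_t req_len)
{
    if (req_len > XFER_BUF_SIZE) {
        return -1;                     /* would overflow xfer_buf */
    }
    if (offset > image_len) {
        return -1;                     /* would read before/after the image */
    }

    size_t avail = image_len - offset; /* safe: offset <= image_len */
    size_t n = (req_len < avail) ? req_len : avail;

    memcpy(xfer_buf, image + offset, n);
    return (int)n;
}
```

The point of the sketch is the ordering: every value that arrives from the host is compared against a known limit before it reaches `memcpy`.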

In researching the USB Mass Storage Driver, our team found that there was an issue in the interaction between the USB Mass Storage Driver – which is used to enable the Zephyr device to act as a USB storage drive – and the RAM storage. A base address that is greater than the total size of the RAM disk will lead the USB driver to err in a way that lets a malicious disk read query read memory past the end of a global buffer, which can disclose kernel memory contents and enable an attacker to obtain code execution within the kernel. This issue has been fixed; the remedy is to introduce basic checks on the sizes in specific input variables to prevent the buffer overflow, and to ensure that bounds-checking is performed in a way that will not be unintentionally stripped from production builds.
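The sketch below shows one way such a bounds check might be written in C so that it survives in production builds. It assumes, purely for illustration, that the risk is a debug-only assertion (standard `assert()` disappears when `NDEBUG` is defined); the check is therefore ordinary control flow. The names `ramdisk_read`, `SECTOR_SIZE`, and `SECTOR_COUNT` are hypothetical and do not come from Zephyr.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512u              /* hypothetical sector size */
#define SECTOR_COUNT 8u                /* hypothetical RAM disk size in sectors */

static uint8_t ramdisk[SECTOR_SIZE * SECTOR_COUNT];

/* Hypothetical RAM disk read: copies `count` sectors starting at `sector`.
 * Returns 0 on success, -1 if the range falls outside the disk. */
int ramdisk_read(uint8_t *dst, uint32_t sector, uint32_t count)
{
    /* A debug-only assert() here would vanish under NDEBUG, so the check
     * is a plain if. Written as a subtraction so sector + count cannot
     * wrap around and slip past the comparison. */
    if (sector >= SECTOR_COUNT || count > SECTOR_COUNT - sector) {
        return -1;
    }

    memcpy(dst, &ramdisk[sector * SECTOR_SIZE], (size_t)count * SECTOR_SIZE);
    return 0;
}
```

Comparing `count` against `SECTOR_COUNT - sector` rather than testing `sector + count` avoids a second bug class, where a huge start address makes the sum wrap to a small number.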

A medium-risk vulnerability is Out-Of-Bounds Write in the USB Mass Storage memoryWrite Handler with Unaligned Sizes. The finding in the memoryWrite handler was that the page array could be overwritten during the copy of USB transfer data from buf to page when USB packet and storage block sizes are misaligned, which, depending upon the layout/order of specific global variables, could in some cases be exploitable. This issue has been fixed; the remedy is to increase the size of the page buffer to meet some minimum threshold and, upon write completion, to move any remaining data to the beginning of the buffer.
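The two-part remedy described above can be sketched in C as follows. This is a simplified model under stated assumptions, not the Zephyr driver: the sizes are tiny so the behavior is easy to trace, the packet size is assumed not to exceed the block size, and `stage_packet`, `BLOCK_SIZE`, `PKT_MAX`, `disk`, `fill`, and `disk_off` are hypothetical names (only `page` and `buf` echo the article).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 8u   /* hypothetical storage block size */
#define PKT_MAX    6u   /* hypothetical max USB packet size (<= BLOCK_SIZE) */

/* Staging buffer sized so an almost-full block plus one packet always fits. */
static uint8_t page[BLOCK_SIZE + PKT_MAX];
static size_t  fill;                  /* bytes currently staged in page */
static uint8_t disk[4 * BLOCK_SIZE];  /* stand-in for the storage backend */
static size_t  disk_off;              /* next write position on the disk */

/* Stages one packet; flushes a full block and carries the leftover bytes
 * to the front of page. Returns 0 on success, -1 on a rejected packet. */
int stage_packet(const uint8_t *buf, size_t len)
{
    if (len > PKT_MAX || fill >= BLOCK_SIZE) {
        return -1;                    /* cannot fit in the staging buffer */
    }
    if (fill + len >= BLOCK_SIZE && disk_off + BLOCK_SIZE > sizeof disk) {
        return -1;                    /* flush would run off the disk */
    }

    memcpy(page + fill, buf, len);    /* fits: fill < BLOCK_SIZE, len <= PKT_MAX */
    fill += len;

    if (fill >= BLOCK_SIZE) {
        memcpy(disk + disk_off, page, BLOCK_SIZE);
        disk_off += BLOCK_SIZE;
        fill -= BLOCK_SIZE;
        memmove(page, page + BLOCK_SIZE, fill);  /* leftover to the front */
    }
    return 0;
}
```

With 6-byte packets and 8-byte blocks, every second packet straddles a block boundary, which is exactly the misaligned case the finding describes.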

Another medium-risk vulnerability was Integer Underflow in USB Mass Storage Driver Write and Verify Handlers. This vulnerability is related to the memoryWrite and memoryVerify functions in Zephyr, where input sanitization is improperly implemented, resulting in an integer underflow. This vulnerability enables an attacker to either leak stack memory contents or corrupt global kernel memory, based on the size value selected, but is only exploitable for specific memory layouts since the attacker does not directly control the requisite parts of the stack buffer. This issue has been fixed in a manner similar to the other input sanitization checks discussed previously.
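For readers unfamiliar with this bug class, the short C sketch below shows why an unchecked unsigned subtraction is dangerous and how a guard avoids it. It is a generic illustration, not code from the memoryWrite or memoryVerify handlers; `bytes_remaining` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: computes how many bytes remain in a block after
 * `offset`. Without the guard, block_size - offset would wrap around to a
 * very large unsigned value whenever offset > block_size, and that value
 * could then be used as a copy length. */
bool bytes_remaining(uint32_t block_size, uint32_t offset, uint32_t *out)
{
    if (offset > block_size) {
        return false;              /* refuse instead of underflowing */
    }
    *out = block_size - offset;    /* safe: offset <= block_size */
    return true;
}
```

For example, with a 512-byte block and an offset of 513, the unguarded 32-bit subtraction yields 4,294,967,295 rather than a negative number.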

Finally, a low-risk vulnerability is that the USB DFU Mode Allows Reading out the Primary Slot Bypassing Image Encryption. This finding means that encrypted firmware images can be decrypted when the (optional) USB DFU mode is enabled, by reading the plaintext firmware image out of the primary image slot using an upload command when both the Zephyr USB DFU and MCUboot encrypted image features are enabled. This attack requires physical access to the device. This issue has not yet been addressed, but could be improved for users by either offering an option to disable the DFU_Upload command in Zephyr – which is what allows reading out of the firmware image at all – or by simply clarifying in the MCUboot documentation that the bypassing of firmware image encryption is one of the “attack vector[s] that enable[s] dumping the internal flash in any way,” which the documentation cautions MCUboot’s threat model does not cover.

How much should MCUboot be able to cover for potential compromises in Zephyr, and what are the implications of the vulnerability found within the open-source secure bootloader?

FERNICK: The robustness of the boot chain implementation plays a substantial role in the security of an embedded OS. In the case of Zephyr and MCUboot, MCUboot performs most of the boot-time firmware integrity verification checks, although Zephyr does retain some responsibility for firmware upgrades performed at runtime. In addition, some aspects of chip configuration that are required for secure boot assurance are outside the scope of both MCUboot and Zephyr, and are instead the responsibility of the device OEM during manufacturing. Some examples of this include the disabling of JTAG or SWD to prevent runtime debugging, or enabling flash read protection to prevent extraction of device secrets. Consequently, a secure boot chain is necessary but not sufficient for a high degree of security assurance of an embedded system running Zephyr or any other RTOS. For that reason, stakeholders and components at many levels and from across the supply chain must uphold specific responsibilities to ensure the desired security properties are achieved.

The MCUboot vulnerability is a potential access of an uninitialized variable in the serial boot process. Due to flaws in an input processing function in MCUboot, the function could potentially make use of a variable that has not been initialized first. If the value of this variable is very large or very small, it could cause an out-of-bounds memory write or an integer underflow/overflow, eventually resulting in memory corruption when decoded bytes are written into the output buffer.
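The general defense against this class of bug is to initialize length variables on every path and validate them before they govern a write. The C sketch below illustrates that idea with an invented framing format (one length byte followed by the payload); it is not MCUboot's serial protocol, and `parse_frame` is a hypothetical function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parser for an invented frame: in[0] is the declared payload
 * length, followed by the payload. Copies the payload into out.
 * Returns 0 on success, -1 if the frame is empty or its declared length
 * does not fit the input or the output buffer. */
int parse_frame(const uint8_t *in, size_t in_len,
                uint8_t *out, size_t out_max, size_t *out_len)
{
    size_t declared = 0;           /* initialized so no path reads garbage */

    if (in_len < 1) {
        return -1;                 /* nothing to parse */
    }
    declared = in[0];

    /* Validate before the value is ever used as a copy length. */
    if (declared > in_len - 1 || declared > out_max) {
        return -1;
    }

    memcpy(out, in + 1, declared);
    *out_len = declared;
    return 0;
}
```

Compiler warnings such as GCC's and Clang's `-Wuninitialized` can catch some cases like this, though not reliably across all control-flow paths.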

So what does this mean for Zephyr/MCUboot users? Should they be avoided? Is there a quick fix? Something else?

FERNICK: Having undergone an initial security review, users can feel a level of increased confidence in the security of Zephyr and MCUboot relative to comparable systems whose internals have gone unexamined. However, I would caution that the desired scope of coverage for the review prioritized some of the system components hypothesized to be of highest security risk while leaving others unexamined, and is therefore not complete. Further work would be required for a full-scope security audit. 

The researchers did find some further opportunities for kernel hardening, as well as evidence, across a number of improper syscall validations, that kernel/user isolation is not necessarily robust on a system-wide scale.

Improper data validation is a common vulnerability type in insecure code, regardless of ecosystem – failure to validate user-generated input data, for example, is the root cause of SQL injection attacks and cross-site scripting, which are ubiquitous enough to be part of the OWASP Top Ten. Of course, IoT devices as a category are anecdotally known for frequently containing a higher number of vulnerabilities than would typically be seen in other types of systems, including but certainly not limited to data validation vulnerabilities.

In general, users of IoT devices are more secure when the designers and manufacturers of embedded devices and associated components used to build the IoT device take security seriously, as evidenced by things like having a vulnerability disclosure program, enabling product security by default, and other design and operational commitments such as those recently outlined in the ioXt Pledge.

The Zephyr project’s response to the disclosed vulnerabilities has been very good: the highest-risk components and several others have been patched, and the Zephyr project appears keen to educate its users about our findings and about the security of the project in general. After we disclosed our findings to the Zephyr team, they created a Product Creators Vulnerability Alert Registry to help them better connect to their customers that use Zephyr in their products. A commitment to continually improving security – and to transparent communication around mitigating security risks in IoT devices – is a great step in the direction toward a more secure world.

Editor's note: NCC Group's complete Zephyr and MCUboot Security Analysis research report can be downloaded from https://research.nccgroup.com/wp-content/uploads/2020/05/NCC_Group_Zephyr_MCUboot_Research_Report_2020-05-26_v1.0.pdf.

For general information visit https://www.nccgroup.trust/us/.

Jennifer Fernick is a computer science researcher and cybersecurity leader with deep technical expertise in cryptography. She spent nearly five years as a PhD researcher at the University of Waterloo as a member of the Centre for Applied Cryptographic Research. She’s worked on bleeding-edge cryptography research, and previously served as Director of Information Security at a major global financial institution where she ran the cryptography team and helped design parts of systems that protected nearly a trillion dollars in assets.

Brandon is responsible for guiding content strategy, editorial direction, and community engagement across the Embedded Computing Design ecosystem. A 10-year veteran of the electronics media industry, he enjoys covering topics ranging from development kits to cybersecurity and tech business models. Brandon received a BA in English Literature from Arizona State University, where he graduated cum laude. He can be reached at [email protected].
