MPU Security Part 2: Partitions and Basic Operations

By Ralph Moore

President

Micro Digital

April 22, 2019

This is the second part of a four-part series of articles presenting a unique product, MPU-Plus, and a methodology for using the Cortex-M Memory Protection Unit (MPU) to achieve improved Microcontroller Unit (MCU) security. Part 1 presented some introductory concepts: MMUs vs. MPUs, the increasing need for security, protection goals, an MPU-Plus snapshot, Cortex-v7M and v8M, and MPU operation.

Figure 1: Partitions

Figure 1 illustrates the software structure that we are trying to achieve for security. In this diagram, ovals represent isolated partitions. The partitions above the heavy line run in umode (unprivileged or user mode) and the partitions below the heavy line run in pmode (privileged or protected mode). The heavy line represents the boundary between unprivileged operation and privileged operation. This isolation is enforced by the Cortex-M processor architecture. It is safe and dependable, unless we do something wrong.

Above the heavy line are two application partitions and one middleware partition. Of course, an actual system is likely to have many more umode partitions. The goal here is to achieve complete isolation of one umode partition from another. Then penetrating one partition does not enable a hacker to penetrate others and thus the breach is contained. Each umode partition is a group of one or more utasks. The utasks form the basis of isolation from tasks in other partitions, but not from tasks in their own partition. umode partitions are capable of strong isolation. Hence, vulnerable code such as drivers, middleware, and application code should be put into umode partitions, whenever possible.

Below the heavy line are the Secure Boot, pcode, smx (the SMX RTOS kernel), and Security partitions. These consist of pcode. Of course, an actual system may have many more pmode partitions. The goal is to also isolate pmode partitions from each other. However, this isolation is not as strong as umode isolation, as discussed later.

Secure Boot

When the system powers up or is rebooted, the processor comes up in pmode and it is in the Secure Boot partition.

Figure 2: Secure Boot

As illustrated in Figure 2, secure boot software performs basic hardware and software initialization, loads code if necessary, creates the tasks necessary to start operation, and then starts the scheduler. Prior to starting the scheduler, no tasks are running. After starting the scheduler, the system is running in task mode and the first task scheduled at the highest priority is running. Other partitions do their own initializations. Both for structural and security reasons, it is best to minimize the code in the secure boot partition. In Figure 2, the secure boot loader is shown in yellow. Secure boot loaders are available from many sources and are outside the scope of this paper. Code shown in green is system and application code.

smx

The SMX RTOS consists of the smx kernel plus middleware. SMX is split such that the smx kernel runs in pmode, whereas the SMX middleware runs in umode. MPU-Plus, when bundled with SMX, eheap™, and certain other products, constitutes SecureSMX™.

The smx partition contains the smx kernel and related software such as the SVC Handler and the PendSV Handler. It runs in pmode in order to strongly isolate it from umode partitions, which could have been corrupted. MPU-Plus extends smx to add security functions. For more information on these, see:

smx User's Guide, by Ralph Moore, Micro Digital, Inc.

smx Reference Manual, by Ralph Moore, Micro Digital, Inc.

We have found that, in addition to adding MPU-Plus, a surprising amount of modification to smx, itself, has been necessary, even though smx has been in use as an embedded kernel for 30 years! Middleware products also have required significant modification. Security seems to be bringing a paradigm shift to embedded systems software.

Security

Finally, there is the Security partition and the Vault. The Vault is where we store the jewels (encryption keys, passwords, authentication codes, certificates, etc.) and the cash (private data). If pmode is breached, the Vault springs open and the Kingdom is lost. Therefore, protecting the Vault is of paramount importance, and only the Security partition, which contains crypto, authentication, and other security tasks, is allowed access to the Vault.

Encryption and authentication software have been moved out of SMX middleware products into the security partition. Thus, only security software has access to the Vault.

pcode

The pcode partition contains Interrupt Service Routines (ISRs), Link Service Routines (LSRs) [1], and other code that must be in pmode. This is a mixture of system, middleware, and application code. In an actual system, this would probably be split into a system partition and an application partition. It may contain some ptasks as well as ISRs and LSRs. Likewise, the smx Error Manager, smx_EM(), and error recovery code are not tasks. Hence, most of the code in this partition runs in the context of the current task.

Handling interrupts presents special security problems, which are discussed in Part 3.

utasks

utasks can provide high levels of isolation. This is primarily because they cannot access the MPU. The MPU is loaded with the regions that a task is permitted to access, including access permissions (e.g. read-only, execute never, etc.), but the task can do nothing to change them. If the Background Region is on, it has no effect in umode. However, all is not peaches and cream – there are heap problems and function call problems, which are discussed later.

ptasks

The isolation provided by ptasks is weak compared to utasks. This is because once a ptask is breached, only one step is required for malware to either turn off the MPU or to turn on its Background Region (BR). Then the MPU regions have no effect. The MPU is defenseless in pmode, whereas it is impregnable in umode.
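The "one step" is a single write to the MPU control register. The sketch below models that register as a plain value so that it can be run on a host. The bit positions are those of the ARMv7-M MPU_CTRL register; the helper names are hypothetical and are not part of smx or MPU-Plus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-M MPU_CTRL bits (the register itself is at 0xE000ED94). */
#define MPU_CTRL_ENABLE     (1u << 0)  /* MPU on */
#define MPU_CTRL_HFNMIENA   (1u << 1)  /* MPU active in HardFault/NMI handlers */
#define MPU_CTRL_PRIVDEFENA (1u << 2)  /* Background Region for privileged code */

/* Hypothetical helper: is privileged code still confined to the loaded regions? */
bool pmode_confined_to_regions(uint32_t ctrl)
{
    return (ctrl & MPU_CTRL_ENABLE) && !(ctrl & MPU_CTRL_PRIVDEFENA);
}

/* Hypothetical helper: is unprivileged code still confined to the loaded regions?
   The Background Region never applies to unprivileged accesses. */
bool umode_confined_to_regions(uint32_t ctrl)
{
    return (ctrl & MPU_CTRL_ENABLE) != 0;
}

/* Walks through the two one-step attacks available to breached pcode.
   Returns 0 if the model behaves as described in the text. */
int mpu_ctrl_demo(void)
{
    uint32_t ctrl = MPU_CTRL_ENABLE;              /* normal protected state */
    assert(pmode_confined_to_regions(ctrl));

    uint32_t br_on = ctrl | MPU_CTRL_PRIVDEFENA;  /* attack 1: turn on the BR */
    assert(!pmode_confined_to_regions(br_on));
    assert(umode_confined_to_regions(br_on));     /* umode is unaffected */

    uint32_t off = ctrl & ~MPU_CTRL_ENABLE;       /* attack 2: turn off the MPU */
    assert(!pmode_confined_to_regions(off));
    return 0;
}
```

A utask cannot perform either write, because the MPU registers are accessible only to privileged code.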

However, ptasks may help to thwart attackers by catching many hacking techniques (e.g. stack or buffer overflow, attempted execution from a stack or buffer, etc.) and triggering a Memory Manage Fault (MMF) before the hacker gains actual control. The MMF handler can then delete the penetrated task and recreate it, hopefully with only a small hiccup in system operation. It can also report the incident, which helps in finding and reducing code vulnerabilities.

ptasks are also useful to catch programming errors and can be a useful step on the way to utasks.

Basic Operations

MPU Control

A Memory Protection Array (MPA) is a set of regions to be loaded into the MPU on a task switch; there is an MPA for each task. A task's index is used to find its MPA in the memory protection table, mpt[index]. smx_TaskCreate() copies the current (parent) task's MPA to the task being created [2]. If the current task, ct, is a ptask, it can change the task's MPA via:

     smx_TaskSet(task, SMX_ST_MPA, tp);

where tp points to the MPA template for the task.
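The relationship between templates, MPAs, and the memory protection table can be modeled on a host with a few lines of C. This is a minimal sketch under stated assumptions: the slot count, table size, type names, and the two helper functions are hypothetical stand-ins, and the region values are placeholders. Only mpt[], the copy-from-parent behavior, and the template assignment come from the text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MPA_SLOTS 4   /* assumption: number of dynamic MPU slots */
#define MAX_TASKS 5   /* assumption: size of the task table */

/* One MPU region as a register pair (names follow ARMv7-M RBAR/RASR). */
typedef struct { uint32_t rbar; uint32_t rasr; } Region;

/* A Memory Protection Array: one entry per dynamic MPU slot. */
typedef struct { Region slot[MPA_SLOTS]; } Mpa;

static Mpa mpt[MAX_TASKS];  /* memory protection table, indexed by task index */

/* Hypothetical stand-in for the MPA step of task creation:
   the child starts with a copy of its parent's MPA. */
void task_create_mpa(int child, int parent)
{
    mpt[child] = mpt[parent];
}

/* Hypothetical stand-in for smx_TaskSet(task, SMX_ST_MPA, tp):
   overwrite the task's MPA from a template. */
void task_set_mpa(int task, const Mpa *tp)
{
    mpt[task] = *tp;
}

int mpa_demo(void)
{
    /* Placeholder region values; real templates come from linker sections. */
    const Mpa tmplt_a = { { {0x00200000u, 0x1u}, {0x20000000u, 0x1u} } };
    const Mpa tmplt_b = { { {0x00204000u, 0x1u} } };

    task_set_mpa(0, &tmplt_a);   /* task 0 gets template a */
    task_create_mpa(1, 0);       /* task 1 inherits from its parent */
    assert(memcmp(&mpt[0], &mpt[1], sizeof(Mpa)) == 0);

    task_set_mpa(1, &tmplt_b);   /* a ptask moves task 1 to template b */
    assert(mpt[1].slot[0].rbar == 0x00204000u);
    assert(mpt[0].slot[0].rbar == 0x00200000u);  /* parent unchanged */
    return 0;
}
```

Tasks that keep the inherited MPA share their parent's regions, which is why they most likely belong to the same partition.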

The foregoing is illustrated in Figure 3. In this figure, MPA0, 1, and 2 share template mpa_tmplta. Hence the three corresponding tasks share the same regions. They are thus most likely in the same partition. Note that MPA3 uses template mpa_tmpltb. Hence, the corresponding task is most likely in a separate partition. The fifth task has not yet been created, nor has its MPA been loaded.

Figure 3: Templates, MPAs, and TCBs

There are as many slots in the MPA as dynamic slots in the MPU [3]. Most slots are filled with static regions defined in the linker command file (a tedious process). However, some slots have pointers to an array of regions that are dynamically created at run time. These are discussed more in Part 4. The top MPU in-use slot, which has highest priority, is reserved for the task stack region. The task stack is either dynamically created from the main heap when a task is created, or obtained from the stack pool when the task is dispatched [4]. While the task is running, any updates to the MPU are also made to its MPA, so it is not necessary to save the MPU contents during a task switch.
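The reason no save is needed on a task switch can be shown with a host-side model. In this sketch the MPU is an ordinary array, and the slot count, type name, and function names are all hypothetical; it illustrates the idea only and is not the smx implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS 4                 /* assumption: dynamic MPU slots */
#define STACK_SLOT (SLOTS - 1)  /* top slot is reserved for the task stack */

typedef struct { uint32_t rbar; uint32_t rasr; } Rgn;

static Rgn mpu_sim[SLOTS];      /* host-side stand-in for the MPU registers */

/* Task switch: load every slot of the incoming task's MPA into the MPU. */
void mpu_load(const Rgn mpa[SLOTS])
{
    for (int i = 0; i < SLOTS; i++)
        mpu_sim[i] = mpa[i];
}

/* Run-time update: write the MPU and the running task's MPA together,
   so the MPA always holds a current copy. */
void mpu_update(Rgn mpa[SLOTS], int slot, Rgn r)
{
    mpu_sim[slot] = r;
    mpa[slot] = r;
}

int mpu_switch_demo(void)
{
    Rgn mpa_a[SLOTS] = { {0x00200000u, 1u} };   /* placeholder regions */
    Rgn mpa_b[SLOTS] = { {0x00204000u, 1u} };
    mpa_a[STACK_SLOT] = (Rgn){0x20001000u, 1u};
    mpa_b[STACK_SLOT] = (Rgn){0x20002000u, 1u};

    mpu_load(mpa_a);                                /* task A runs */
    mpu_update(mpa_a, 1, (Rgn){0x20008000u, 1u});   /* A gains a region */

    mpu_load(mpa_b);                                /* switch to B: nothing saved */
    assert(mpu_sim[1].rbar == 0u);                  /* A's region is gone */
    assert(mpu_sim[STACK_SLOT].rbar == 0x20002000u);

    mpu_load(mpa_a);                                /* switch back to A */
    assert(mpu_sim[1].rbar == 0x20008000u);         /* restored from the MPA */
    return 0;
}
```

Because every update goes to both places, the outgoing task's MPA is already current when the switch occurs, and the switch reduces to a load.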

Creating static regions is a laborious process. For example, for a code region it is necessary to identify all functions needed by a particular partition or a particular task, including subroutines. Pragmas are inserted into the code to put all of these into a unique code section, for example:

#pragma default_function_attributes = @ ".ut1a_text"
void tm05_ut1a(void)
{
   smx_SemSignal(sbr1);
}
#pragma default_function_attributes =

Then a block is defined in the linker command file to hold this and related sections, for example:

define block ut1a_code with size = 1024, alignment = 1024 {ro section .ut1a_text, ro section .ut1a_rodata};

Regions are defined in the linker command file and blocks are placed in them, for example:

define region ROM = mem:[from 0x00200000 to 0x002FFFFF];

place in ROM  {block t2a_code, ro section .tmplt, block ut1a_code, block ut2a_code, block ut2b_code};

Back in the code, a slot in the MPA is defined:

#pragma section = "ut1a_code"
MPA mpa_tmplt_ut1a =
{
...
   RGN(3 | RA("ut1a_code") | V, CODE | RSI("ut1a_code") | EN, "ut1a_code"),
...
};

All of this is done for one MPU region in one template, which is clearly a laborious process. The template macros shown above (e.g. RGN()) reduce the work and help to reduce errors. Because some of the statements are in the code and some are in the linker command file, the process is error prone. Not only that, it is very easy to leave out an obscure subroutine for a code region or a variable for a data region, resulting in an annoying MMF during debug (a good reason to turn off MMFs during the early stages of debugging).
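The size and alignment given for the ut1a_code block are not arbitrary. On ARMv7-M, an MPU region must be a power of two in size (at least 32 bytes) and its base address must be a multiple of its size. The sketch below checks those rules and builds the address and size fields of the two region registers. The function names and the example base address are hypothetical, and it is only an assumption that the RA(), RSI(), V, and EN template macros produce equivalent fields. ARMv8-M uses a different region format and is not covered here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RBAR_VALID   (1u << 4)   /* RBAR.VALID: REGION field selects the slot */
#define RASR_ENABLE  (1u << 0)   /* RASR.ENABLE */

/* ARMv7-M: region size must be a power of two and at least 32 bytes. */
bool region_size_ok(uint32_t size)
{
    return size >= 32u && (size & (size - 1u)) == 0u;
}

/* ARMv7-M: the base address must be a multiple of the region size. */
bool region_base_ok(uint32_t base, uint32_t size)
{
    return region_size_ok(size) && (base & (size - 1u)) == 0u;
}

/* RASR.SIZE field value n, where the region covers 2^(n+1) bytes. */
uint32_t rasr_size_field(uint32_t size)
{
    uint32_t n = 0;
    while ((2u << n) < size)   /* 2u << n == 2^(n+1) */
        n++;
    return n;
}

/* Build RBAR for a slot: base address | VALID | region number. */
uint32_t make_rbar(uint32_t base, uint32_t slot)
{
    return base | RBAR_VALID | (slot & 0xFu);
}

/* Build the size and enable part of RASR (attribute bits omitted). */
uint32_t make_rasr_size_en(uint32_t size)
{
    return (rasr_size_field(size) << 1) | RASR_ENABLE;
}

int region_demo(void)
{
    /* Mirrors the 1024-byte, 1024-aligned ut1a_code block; the base address
       is a made-up example inside the ROM region. */
    uint32_t base = 0x00200400u, size = 1024u;
    assert(region_base_ok(base, size));
    assert(!region_base_ok(0x00200200u, size));   /* misaligned base */
    assert(rasr_size_field(size) == 9u);          /* 2^(9+1) = 1024 */
    assert(make_rbar(base, 3u) == 0x00200413u);
    assert(make_rasr_size_en(size) == 0x13u);
    return 0;
}
```

A block that outgrows its declared size must jump to the next power of two and be realigned accordingly, which is one more reason the linker command file and the templates have to be kept in step.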

System Calls

ptasks can call all smx and system functions directly, but utasks cannot call them directly because they must execute in pmode. Instead, the SVC N instruction is used. For umode code, the xapi.h header file, which contains smx and system function prototypes, is replaced with the xapiu.h header file. The latter maps smx_ calls to smxu_ calls, which are shell functions that invoke SVC N, where N is the system call ID. However, restricted calls, which are prohibited for umode, generate Privilege Violation errors. Restricted calls can be made only by pcode. For example, smx_HeapInit() is not needed by utasks and could cause system harm if called from malware, so there is no smxu_HeapInit(). A reasonable set of restricted calls is defined. However, this set can be expanded or contracted, as necessary, for a specific application.

Figure 4: System Calls

Figure 4 illustrates the system call mechanisms for both utasks and ptasks. The SVC Handler uses N as an index through the SSR jump table to the smx System Service Routine (SSR). The SSR executes in pmode, then returns the result to the SVC Handler. The SVC Handler returns this result to the smxu shell function, which returns it to the utask. All of this detail is hidden from the caller, and it appears as if a normal function call were made. A system call that is not allowed in umode results in a branch to the Privilege Violation Error Manager (PVEM), which, in turn, calls the smx Error Manager (EM).

Note that an smx call from a ptask goes directly to an SSR, and there are no disallowed service calls.
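The two call paths can be modeled on a host with a table of function pointers. This is a minimal sketch, assuming a two-entry table: the call IDs, the function names, and the placeholder SSR bodies are all hypothetical, and the real SVC Handler extracts N from the SVC instruction rather than receiving it as an argument.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t (*Ssr)(uint32_t arg);

/* Hypothetical system call IDs (the N in SVC N). */
enum { ID_SEM_SIGNAL, ID_HEAP_INIT, ID_COUNT };

static int pv_errors;   /* privilege violations reported */

/* Stand-ins for System Service Routines; bodies are placeholders. */
static uint32_t ssr_sem_signal(uint32_t sem) { return sem + 1u; }
static uint32_t ssr_heap_init(uint32_t sz)   { return sz; }

/* Stand-in for the Privilege Violation Error Manager. */
static uint32_t pvem(uint32_t arg) { (void)arg; pv_errors++; return 0u; }

/* Jump table used from umode: restricted calls point at the PVEM. */
static const Ssr ssr_table[ID_COUNT] = {
    [ID_SEM_SIGNAL] = ssr_sem_signal,
    [ID_HEAP_INIT]  = pvem,
};

/* Models the SVC Handler: index the table with N, run the entry in pmode. */
uint32_t svc_handler(uint32_t n, uint32_t arg)
{
    if (n >= ID_COUNT)
        return pvem(arg);          /* unknown ID: treat as a violation */
    return ssr_table[n](arg);
}

/* umode shell: what an smxu_ function reduces to in this model. */
uint32_t u_sem_signal(uint32_t sem) { return svc_handler(ID_SEM_SIGNAL, sem); }

/* pmode: direct call to the SSR, no table and no restrictions. */
uint32_t p_heap_init(uint32_t sz) { return ssr_heap_init(sz); }

int syscall_demo(void)
{
    pv_errors = 0;
    assert(u_sem_signal(4u) == 5u);                /* allowed from umode */
    assert(svc_handler(ID_HEAP_INIT, 64u) == 0u);  /* restricted from umode */
    assert(pv_errors == 1);
    assert(p_heap_init(64u) == 64u);               /* allowed from pmode */
    assert(pv_errors == 1);
    return 0;
}
```

Expanding or contracting the set of restricted calls then amounts to changing which table entries point at the violation handler.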

The next part of this series will discuss partition problems including heap usage, function call APIs, interrupts, parent and child tasks, and task local storage. For more information see www.smxrtos.com/mpu.

Ralph Moore is President of Micro Digital. A graduate of Caltech, he has served many roles at Micro Digital since founding it in 1975. Currently, he is the lead architect and programmer for MPU-Plus, eheap, and smx.


[1] Link Service Routines are unique to smx. ISRs cannot make smx calls; they must invoke LSRs to do so. LSRs can make most smx calls but cannot wait. They operate at a higher priority than any task and provide a convenient method for deferred interrupt processing and to buffer peak interrupt loads.

[2] Parent and child tasks are discussed in Part 3.

[3] In rare cases, certain regions may be permanently assigned to top MPU slots and thus these slots do not appear in MPAs.

[4] Allowing permanent or temporary stacks for tasks is a unique feature of smx. Temporary stacks are used with one-shot tasks, which do not have infinite loops. Instead, they run once, stop, and release their stacks, thus permitting a few stacks to be shared among many one-shot tasks.

I am no longer running the daily business at Micro Digital. Instead, I have been involved for the past four years in improving the smx RTOS kernel. smx is a hard-real-time multitasking kernel, which is intended for embedded systems that require high efficiency and high performance.
