Protect and control software stored in flash memory

OpenSystems Media

It's been said that for every measure there's a countermeasure, and that's also true for securing code in embedded systems. Sometimes a small device can be just the countermeasure needed to thwart cloning of flash contents.

Many systems use external standard flash memory chip(s) to store the operating program for processors that do not include embedded nonvolatile program storage. This is great because it allows easy flash memory expansion and software modification, whether on the manufacturing line, as a customer download, or during a maintenance operation. The downside is that the OEM loses control over the contents of the flash, potentially allowing unauthorized copies or modification.

It’s not just lost revenue to be worried about, however. If malware is downloaded into a system, the OEM’s reputation might be affected. In the case of systems like medical devices, the OEM might even be exposed to liability concerns.

Taking back security

Hardware security chips can help bring control back to the OEM. Programmable, highly secure smart card processors have been available for some time but require additional firmware to be written and can add unacceptable costs to the system. Hardware authentication chips, on the other hand, are turnkey devices that do not require internal programming or detailed knowledge of cryptographic algorithms and are modestly priced.

The way these chips work is pretty straightforward. The system microprocessor sends a challenge to the chip, then the chip uses a cryptographic algorithm to combine the challenge with a secret that is securely stored in nonvolatile memory. The response is then sent back to the system. The algorithm implemented inside the chip is chosen in such a way that an observer looking at the bus who can see both the challenge and the response can’t determine the value of the secret. Depending on how securely the chip stores the secret, it can be very difficult to copy a personalized chip like this.
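The exchange described above can be sketched in a few lines of Python. This is a minimal model, not any particular chip's protocol: the AuthChip class, the use of HMAC-SHA-256 as the combining algorithm, and the assumption that the host holds its own copy of the secret are all illustrative choices.

```python
import hashlib
import hmac
import secrets


class AuthChip:
    """Software stand-in for a hardware authentication chip (hypothetical model)."""

    def __init__(self, secret: bytes) -> None:
        # In real hardware the secret sits in protected nonvolatile memory
        # and never appears on the bus.
        self._secret = secret

    def respond(self, challenge: bytes) -> bytes:
        # Assumption: the chip combines challenge and secret with HMAC-SHA-256.
        return hmac.new(self._secret, challenge, hashlib.sha256).digest()


def host_authenticates(chip: AuthChip, expected_secret: bytes) -> bool:
    """Send a fresh random challenge and check the chip's response."""
    challenge = secrets.token_bytes(32)
    response = chip.respond(challenge)
    expected = hmac.new(expected_secret, challenge, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(response, expected)
```

An observer on the bus sees only the challenge and the response. With a keyed hash of this kind, neither value reveals the secret.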

While these chips can be used in all sorts of ways to add security to a system, two software protection features are of particular interest. The first, secure boot, provides a way to ensure that only authentic programs are executed while still permitting upgrades to happen. The second, anti-cloning, prevents unauthorized system builds or outright copies of the design.

Secure boot

System-on-Chip (SoC) devices often include a small boot ROM that contains the program used to initialize chip operation prior to executing the contents of the external flash memory. This boot ROM can be easily reprogrammed to work with an external authentication chip.

Prior to system shipment, the OEM stores a validating value in the flash memory alongside the program. This is computed by combining a digest of the program with a secret, a copy of which is stored in the authentication chip. A hash algorithm such as Secure Hash Algorithm 1 (SHA-1) or SHA-2 is used to generate the program digest. A hacker might be able to change the contents of the flash, but without knowing the secret, can’t generate a new validation value.

During execution of the code in the boot ROM, the microprocessor generates in real time a digest of the executable program stored in the flash memory (see Figure 1). This digest is then sent to the authentication chip as the challenge. The chip will combine the digest with its internally stored secret, and the response can be treated as a kind of program signature. If the response matches the validation value stored in flash, execution of the flash contents is allowed to continue; if not, the microprocessor can cycle to the downloader to wait for a valid flash image to be loaded.

Figure 1: A boot ROM can work with a microprocessor that generates a digest of the executable program stored in flash and sends it to the authentication chip as the challenge.
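As a rough sketch of this boot flow, the Python below models both the factory step and the boot-time check. The function names, the choice of SHA-256 for the digest, and the use of HMAC to stand in for the chip's internal combining step are assumptions made for illustration.

```python
import hashlib
import hmac


def validation_value(program: bytes, secret: bytes) -> bytes:
    """Factory step: combine the program digest with the OEM secret."""
    digest = hashlib.sha256(program).digest()
    return hmac.new(secret, digest, hashlib.sha256).digest()


def chip_response(digest: bytes, chip_secret: bytes) -> bytes:
    """Models the authentication chip: the digest is the challenge."""
    return hmac.new(chip_secret, digest, hashlib.sha256).digest()


def boot(flash_program: bytes, stored_validation: bytes, chip_secret: bytes) -> str:
    """Boot ROM step: hash the flash image, ask the chip, compare."""
    digest = hashlib.sha256(flash_program).digest()
    response = chip_response(digest, chip_secret)
    if hmac.compare_digest(response, stored_validation):
        return "run"          # flash contents are authentic
    return "downloader"       # wait for a valid image to be loaded
```

A modified image produces a different digest, so the chip's response no longer matches the value stored in flash and the system falls back to the downloader.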

This scheme can have a security weakness if a hacker can send modified software to the authentication chip, use a logic analyzer to read the response, and then store this validation value in the flash memory with the modified code. However, there are several ways to resolve this.

The best solution is to use an authentication chip that doesn’t return the expected validation value but instead takes it on input and returns a true/false to indicate a match. The digest usually will be too large and the chip too slow for an attacker to guess the correct validation value for modified code. For even greater security, the security chip can cryptographically combine a random challenge (or perhaps the current time or processor serial number) with the true/false and return that to the processor. This way, a simple switch kind of circuit modification can’t be used to spoof the processor.
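One way to picture this match-only interface is the sketch below. It assumes, purely for illustration, that the chip and the boot ROM share a second key (called link_key here) used to bind the true/false result to a random nonce. The MatchChip class and its method names are hypothetical.

```python
import hashlib
import hmac
import secrets


class MatchChip:
    """Hypothetical chip that never outputs the validation value itself."""

    def __init__(self, secret: bytes, link_key: bytes) -> None:
        self._secret = secret      # used to compute the expected validation value
        self._link_key = link_key  # assumption: shared with the boot ROM

    def check(self, digest: bytes, candidate: bytes, nonce: bytes) -> bytes:
        expected = hmac.new(self._secret, digest, hashlib.sha256).digest()
        flag = b"\x01" if hmac.compare_digest(expected, candidate) else b"\x00"
        # Bind the result to the nonce so a fixed "always true" reply cannot be replayed.
        return hmac.new(self._link_key, nonce + flag, hashlib.sha256).digest()


def host_accepts(chip: MatchChip, link_key: bytes, digest: bytes, candidate: bytes) -> bool:
    nonce = secrets.token_bytes(16)  # fresh for every boot
    reply = chip.check(digest, candidate, nonce)
    good = hmac.new(link_key, nonce + b"\x01", hashlib.sha256).digest()
    return hmac.compare_digest(reply, good)
```

Because the reply depends on a fresh nonce, tying the chip's output line to a constant level or replaying an earlier "true" answer does not satisfy the processor.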

Another method is to mechanically prevent access to the security chip’s pins. For an ASIC SoC, the security chip can be purchased in die form and integrated into the main package in a multidie package. Another way is to purchase the security chip in a package similar to a BGA, which doesn’t permit probing because the pads are completely hidden. Or the security chip on the board can be conformally coated with epoxy to prevent access.

In some cases, the system might be able to calculate the digest of the flash program using software in the boot ROM. However, validating the entire memory array at boot can be too time-consuming, especially for systems with larger flash memories. There are two ways to address this issue: incremental verification or hardware acceleration.

With an incremental verification scheme, only the module loader stored in the flash is verified using the boot ROM code. Prior to each new module being loaded for execution, the module loader performs the same validation procedure on that module using the authentication chip. The modules can also be validated in advance during idle time to improve event response performance.
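A minimal sketch of incremental verification follows. The ModuleLoader class, its flash layout (a name mapped to module bytes plus a stored validation value), and the HMAC-based chip_sign helper are all assumptions for illustration. A real loader would talk to the chip over its bus rather than hold the secret.

```python
import hashlib
import hmac


def chip_sign(secret: bytes, data: bytes) -> bytes:
    """Stand-in for the authentication chip: keyed hash over the module digest."""
    digest = hashlib.sha256(data).digest()
    return hmac.new(secret, digest, hashlib.sha256).digest()


class ModuleLoader:
    """Hypothetical loader that validates each module before first use."""

    def __init__(self, chip_secret: bytes, flash: dict) -> None:
        self._chip_secret = chip_secret  # models the secret held inside the chip
        self._flash = flash              # name -> (module bytes, stored validation value)
        self._verified = set()           # modules already checked since boot

    def verify(self, name: str) -> bool:
        if name in self._verified:
            return True                  # already validated, skip the hashing cost
        image, stored = self._flash[name]
        if hmac.compare_digest(chip_sign(self._chip_secret, image), stored):
            self._verified.add(name)
            return True
        return False

    def prevalidate_all(self) -> None:
        """Call during idle time so later loads respond quickly."""
        for name in self._flash:
            self.verify(name)

    def load(self, name: str) -> bytes:
        if not self.verify(name):
            raise RuntimeError(f"module {name!r} failed validation")
        return self._flash[name][0]
```

Only the modules that are actually used pay the validation cost, and modules checked during idle time load without further delay.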

Modern processors don’t always include a hardware hash engine, but Advanced Encryption Standard (AES) or Triple Data Encryption Standard (3DES) engines are quite common. By configuring the encryption engine to operate in Cipher-based Message Authentication Code (CMAC) mode, it’s easy to use these encryption algorithms to generate the program digest at hardware speeds.

Anti-cloning

Most OEMs now use subcontractors to build their devices. Consequently, systems sometimes are overbuilt for local sale or perhaps on the gray market. Alternatively, competitors or hackers might clone the system and sell it at a lower cost because they don’t have to invest in software development. Manufacturing costs can be reduced if the system uses only off-the-shelf components, but this makes unauthorized systems even easier to build.

Using hardware security chips can put an end to these clones without significantly increasing the size or cost of the system. Compiled into the embedded software are many tests for the presence of a properly programmed hardware security chip. The OEM controls the secret that is programmed into the chip and controls the distribution of the programmed chip to the subcontractors. As another option, the chip vendor can manage the personalization of the chip for the OEM.

There are several ways to implement these software tests. One simple method is to compile a challenge and an expected response in the software. If the security chip is missing or has the wrong secret, the response doesn’t match, and the system can be disabled or go back into download mode to get a corrected file. Add these checks in many places in the program, and they can be hard for a hacker to remove, especially when the code is verified by the ROM on initial load.
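The simple compiled-in check might look like the sketch below. The helper names are hypothetical, HMAC-SHA-256 stands in for the chip's algorithm, and the respond argument represents whatever bus driver the system uses to reach the chip.

```python
import hashlib
import hmac
import secrets


def chip_respond(secret: bytes, challenge: bytes) -> bytes:
    """Models a correctly programmed security chip (assumed HMAC-SHA-256)."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def make_embedded_pair(oem_secret: bytes):
    """Build-time step: the OEM picks a challenge and records the expected response."""
    challenge = secrets.token_bytes(16)
    return challenge, chip_respond(oem_secret, challenge)


def chip_is_genuine(respond, challenge: bytes, expected: bytes) -> bool:
    """Run-time step: ask the chip and compare with the compiled-in value."""
    try:
        response = respond(challenge)
    except OSError:
        return False  # chip missing or bus failure
    return hmac.compare_digest(response, expected)
```

Each embedded pair lets the software test for the chip without containing the secret itself. Using a different pair at each check site makes the checks harder to find and patch out as a group.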

Other options for these software tests include distributing both the challenge generation and the response checking over various sections of the program. The response from the security chip can be used as the key for on-the-fly software module decryption. The response can be XOR’d with a separate constant and then used as a jump vector. If the security chip supports it, then multiple challenges can be sent from different sections of code and combined to generate a single response.
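The jump-vector idea can be illustrated as follows. This is a sketch under stated assumptions: a Python handler table stands in for code addresses, only the first four bytes of the response are used, and the function names are invented for the example.

```python
import hashlib
import hmac


def chip_respond(secret: bytes, challenge: bytes) -> bytes:
    """Models the security chip (assumed HMAC-SHA-256)."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def build_mask(expected_response: bytes, target_index: int) -> int:
    """Build-time step: choose the constant so the right response yields the target."""
    return int.from_bytes(expected_response[:4], "big") ^ target_index


def resolve_index(response: bytes, mask: int) -> int:
    """Run-time step: XOR the chip response with the compiled-in constant."""
    return int.from_bytes(response[:4], "big") ^ mask


def dispatch(handlers, response: bytes, mask: int):
    index = resolve_index(response, mask)
    if index >= len(handlers):
        return "halt"  # a wrong response almost always lands outside the table
    return handlers[index]()
```

There is no explicit comparison for an attacker to patch out. Without the correct response, the program simply cannot compute where to go next.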

In a typical implementation, many different kinds of tests are included in the software so that even if one mechanism is defeated, the others still do their job. Ideally, these tests depend on multiple secrets being stored in the security chip, ensuring that even if one secret value gets divulged, the overall system security is maintained.

Secret security

All of this doesn’t matter too much if it’s easy to get the secret out of the authentication chip. In this case, a hacker can create the correct software validation value or a system cloner can model the security chip with a simple microprocessor. Authentication chips protect the secret in at least two ways: using strong cryptographic algorithms and using special hardware chip design techniques to prevent direct or indirect attacks on the silicon.

In the past, some form of Linear Feedback Shift Register (LFSR), the same structure used to compute a Cyclic Redundancy Check (CRC), was used as a hash algorithm. These were common due to their low cost of implementation, but with modern high-speed PCs, these algorithms can often be analyzed and broken in a short period of time.

LFSR/CRC algorithms are especially weak if the secret size is too small, as a brute force attack becomes possible using relatively simple software. There is no universal rule about what size is large enough, but most modern systems use secrets that are 128 bits or longer.
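To make the brute force point concrete, the toy example below recovers a 16-bit secret from a single observed challenge and response. The weak_response scheme is invented for illustration (a CRC-32 over the secret followed by the challenge) and does not represent any real product.

```python
import zlib


def weak_response(secret: int, challenge: bytes) -> int:
    """Toy CRC-based 'authentication' with a 16-bit secret (illustrative only)."""
    return zlib.crc32(secret.to_bytes(2, "big") + challenge)


def brute_force(challenge: bytes, observed: int):
    """Try every possible 16-bit secret against one pair captured from the bus."""
    for candidate in range(1 << 16):  # only 65,536 possibilities
        if weak_response(candidate, challenge) == observed:
            return candidate
    return None


# Keyspace comparison: a 16-bit secret versus a 128-bit secret.
SMALL_KEYSPACE = 2 ** 16
LARGE_KEYSPACE = 2 ** 128
```

The search over 65,536 candidates finishes almost instantly on a PC. A 128-bit secret has 2^128 possible values, which puts this kind of exhaustive search out of reach.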

Right now, the SHA algorithms are the best choice for secure boot and anti-cloning. SHA-1 is secure enough today, but it has some known weaknesses and has been replaced by the SHA-2 family (including SHA-256 and SHA-512, among others). Because the lifetime of most embedded systems is measured in years, using the latest algorithm will ensure the security of the system even at the end of its useful life. (Editor’s note: We shot Figure 2 on July 1, 2010, from the National Institute of Standards and Technology website for your consideration. Note the 2010 date reference in this NIST Policy on Hash Functions.)

Figure 2: The National Institute of Standards and Technology delineates rules of use for SHA-2.

It’s also possible to purchase authentication chips that use public key (asymmetric) algorithms, which are typically slower and more complex. The software on the system side can also be much more complicated. As compared to authentication chips using the hash algorithms, they can increase the security of the secure boot scheme while offering little or no additional benefit against software cloning.

But a strong algorithm is not enough. Microprobers are easily purchased on eBay these days, so it’s important for the chip to protect against an attacker who might etch away the package and microprobe some internal nodes to get at these secrets. Modern chips prevent this with active internal shields over the entire chip, more than three layers of narrow-width metal, extra encryption on the internal blocks, and no exposed test pads.

Hackers also might try high or low voltage or excessive clock frequencies to get the authentication chip to reveal its secrets. These attacks can be defended against with internal tamper detectors that shut down the chip if an operation is attempted outside of the normal operating range. These are common security blocks, and most chip manufacturers add other proprietary security components beyond the usual tamper blocks.

Embedded implementation

Authentication chips in an embedded system can detect unauthorized modification or copying of system software stored in flash memory. In addition, they can be used in a variety of other ways to exchange session encryption keys, provide node authentication to a remote server, authenticate serial number storage, securely store manufacturing and/or maintenance history, and a wide variety of other security-related functions.

High-security authentication chips do not require designers to have any special cryptographic knowledge and can be integrated in embedded systems without affecting time to market. Usually found in small packages, they are suitable for even the most space-sensitive applications. One such chip is the Atmel AT88SA102S. It combines the SHA-256 algorithm with a 256-bit key length and an easy-to-use one-wire interface compatible with all microprocessors. The design includes an active shield over the entire circuit, tamper detectors, and encrypted internal memories.

Kerry Maletsky is business unit director of crypto products at Atmel Corporation, where he is responsible for new product definition, technical and strategic marketing, and product management for embedded secure processors and contactless smart card chips. A veteran of nearly 30 years in the semiconductor industry, Kerry has held a wide range of engineering and management positions. Prior to Atmel, he worked for Bell Laboratories and several start-ups including Simtek. He holds a number of patents relating to IC design and has given presentations and published articles in a variety of venues. Kerry holds a BS in Electronic Engineering and an MS in Computer Science from Stevens Institute of Technology.

Atmel Corporation
408-441-0311
www.atmel.com

Kerry Maletsky (Atmel Corporation)