Encryption 101: Choosing the right scheme

December 9, 2014 OpenSystems Media

This intro to encryption offers some of the pitfalls that can derail the inexperienced user.

About six years ago, I started working with a UK crypto company that licenses encryption intellectual property (IP) cores. At the time, I confess, I knew little about encryption, and I’m still far from an expert. However, there are some common misunderstandings and misconceptions on the subject that I’m happy to share, and I’ll explain which type of encryption is best for your application. (If you’re already an encryption expert, stop here.)

The first question to consider is why use a hardware-based solution when a processor can efficiently run algorithms like Advanced Encryption Standard (AES)? There are two main reasons. The first is security. A software solution is often more accessible to an attacker, who can try to intercept or alter the data or program. Software is more vulnerable to malware such as viruses and Trojan horses. It may also include an operating system, which makes it large and complex. That complexity offers many opportunities to an attacker and makes the system difficult to analyse for security weaknesses.

In contrast, the IP running on an FPGA is physically contained inside the device. Isolating the encryption logic in fixed hardware, and never allowing the software to come in contact with critical security parameters such as keys, makes it easier to analyse the system’s overall security and eliminates many classes of threat. The design should have only “plaintext” (i.e., message-in-the-clear) and fully encrypted data (i.e., ciphertext) going anywhere near the device’s pins.

The second reason is performance. Often, software simply can’t offer sufficient performance or requires a costly high-performance processor to do the task. FPGAs excel at massively parallel processing, and it’s easy to get duplex throughputs of 10 Gbps with modest clocks. Using faster silicon, higher clock speeds and more FPGA resources can push the throughput up to 100 Gbps or beyond. That’s a couple of orders of magnitude faster than a software solution. The FPGA hardware solution will also burn less power, which can be an important consideration in many applications.

There are fundamentally different ways that data is encrypted, say, for Internet banking and for streaming data on a network. The two common forms of cryptography are public and private key ciphers (private key ciphers are also called secret-key or symmetric ciphers). Public key ciphers are useful for securely exchanging small amounts of information, whereas private key ciphers are used for securing larger volumes of data. AES, a private key scheme, is a symmetric-key block cipher. In other words, the same key is used to encrypt and decrypt blocks of data. This means that the two ends must agree on which key to use.

This introduces the subject of encryption keys. AES typically uses either 128- or 256-bit keys (there’s an option for 192 bits, but it’s rarely used). A 256-bit key is twice the length of a 128-bit key, but does it result in an IP core twice the size? The answer is no. With a 128-bit key, the plaintext undergoes ten “rounds” of transformation to produce the encrypted output, while a 256-bit key needs 14. So the core is roughly 40 percent bigger for the same throughput, while the latency increases by a few more clock ticks. But 256-bit keys give a huge increase in the number of possible keys (about 3.4E38 times as many). It’s important not to regard the number of possible keys as a complete measure of overall security. If someone sets out to compromise the security in your system, there are various attack methods that may be tried. Lots of users decide that 128-bit keys are more than adequate. It just depends on the application.
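The key-size numbers above are easy to check for yourself. The short Python sketch below only does the arithmetic; the round counts are the ones defined for AES in FIPS 197, and the variable names are mine.

```python
# Key-space and round-count arithmetic for AES-128 versus AES-256.
keys_128 = 2 ** 128
keys_256 = 2 ** 256

# How many times larger the 256-bit key space is than the 128-bit one.
ratio = keys_256 // keys_128
print(f"AES-128 key space: {keys_128:.3e} keys")
print(f"AES-256 key space is larger by a factor of {ratio:.3e}")

# Number of rounds per key length, as defined for AES.
rounds = {128: 10, 192: 12, 256: 14}
extra = (rounds[256] - rounds[128]) / rounds[128]
print(f"Extra rounds needed for a 256-bit key: {extra:.0%}")
```

The factor works out to 2^128, which is where the 3.4E38 figure comes from, and four extra rounds on top of ten is the source of the "roughly 40 percent" estimate.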

The good news for someone new to cryptography is that AES is pretty much the only private key cipher you need to know about as it has largely displaced its competitors for new designs. The bad news is that the basic AES system offers a choice of modes and selecting the right one is vital. These form the usual alphabet soup beloved by engineers (ECB, CBC, CFB1, CFB8, CFB128, OFB, and CTR). Fortunately, only two of these modes (CBC and CTR) are used much.

It’s important, but not so straightforward, to choose the correct one. For example, electronic codebook (ECB) mode should not be used for data that might contain lots of repeated patterns, such as video or pictures, because of how ECB mode is organized internally. ECB mode can be thought of as a virtual codebook with 2^128 entries in which a 128-bit input block is looked up, resulting in a 128-bit output block. Each key corresponds to a different codebook, and without the key, knowledge of the output block gives no information about the input block.

In ECB mode, a particular input block will always result in the identical output block no matter where it occurs in the data stream being encrypted. ECB mode should be viewed as a building block for more complex modes of the cipher, and the configuration can be doubled up to provide greater throughput.
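To see why repeated patterns leak in ECB mode, here is a minimal Python sketch. Python’s standard library has no AES, so the block cipher is a toy 128-bit Feistel network built from SHA-256. That stand-in, and every function name here, is my own assumption for illustration only; it is not AES and is not secure.

```python
import hashlib

BLOCK = 16  # bytes, matching the 128-bit AES block size


def _round(half, key, i):
    """Toy round function: 8 bytes of SHA-256 over key, round index, half-block."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]


def toy_encrypt_block(block, key, rounds=4):
    """Toy 128-bit Feistel cipher standing in for AES (NOT secure)."""
    left, right = block[:8], block[8:]
    for i in range(rounds):
        f = _round(right, key, i)
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def toy_decrypt_block(block, key, rounds=4):
    """Inverse of toy_encrypt_block: undo the rounds in reverse order."""
    left, right = block[:8], block[8:]
    for i in reversed(range(rounds)):
        f = _round(left, key, i)
        left, right = bytes(a ^ b for a, b in zip(right, f)), left
    return left + right


def ecb_encrypt(data, key):
    """ECB: every block is encrypted independently with the same key."""
    assert len(data) % BLOCK == 0, "sketch assumes whole blocks, no padding"
    return [toy_encrypt_block(data[i:i + BLOCK], key)
            for i in range(0, len(data), BLOCK)]


key = b"demo key"
plaintext = b"A" * 16 + b"B" * 16 + b"A" * 16  # blocks 0 and 2 are identical
blocks = ecb_encrypt(plaintext, key)

print("block 0 == block 2:", blocks[0] == blocks[2])
print("block 0 == block 1:", blocks[0] == blocks[1])
```

The first and third ciphertext blocks match because their plaintext blocks match, so an observer learns where the data repeats without ever seeing the key. In an image, those repeats trace out the picture.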

In CTR mode encryption, the output of a counter is encrypted and the resulting output XORed with the plaintext to form the ciphertext. The counter is then incremented for the next data block. The decryption operation is identical because the ciphertext is XORed with the encrypted counter output to form the plaintext.
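The CTR data flow can be sketched in a few lines of Python. Because CTR only ever runs the block cipher in the forward direction, I have assumed a truncated HMAC-SHA-256 as a stand-in for "AES-encrypt the counter block"; a real design would use AES here, and the function names are hypothetical.

```python
import hashlib
import hmac

BLOCK = 16  # bytes, matching the 128-bit AES block size


def encrypt_counter(key, counter):
    """Stand-in for AES-encrypting a 128-bit counter block (assumption, not AES)."""
    counter_block = counter.to_bytes(BLOCK, "big")  # raises if the counter overflows
    return hmac.new(key, counter_block, hashlib.sha256).digest()[:BLOCK]


def ctr_xcrypt(key, start_counter, data):
    """Encrypt or decrypt: XOR the data with the encrypted counter stream."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        keystream = encrypt_counter(key, start_counter + offset // BLOCK)
        chunk = data[offset:offset + BLOCK]
        out.extend(a ^ b for a, b in zip(chunk, keystream))
    return bytes(out)


key = b"demo key"
message = b"CTR mode turns a block cipher into a stream cipher."
ciphertext = ctr_xcrypt(key, 0, message)
recovered = ctr_xcrypt(key, 0, ciphertext)  # the same operation decrypts

print(recovered == message)
```

Note that each keystream block depends only on the key and the counter value, not on neighbouring blocks, which is what makes this mode friendly to parallel hardware.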

There are two specific issues to be aware of with CTR mode. First, the counter must never be allowed to wrap around. Each block must be encrypted with a unique counter value; if a value is reused under the same key, an attacker can exploit the repeated keystream and confidentiality is compromised. Second, because there’s a direct bit-to-bit mapping between plaintext and ciphertext, an attacker can flip one bit in the ciphertext and know that the corresponding bit in the decrypted plaintext will change value. To prevent this, CTR mode is generally combined with an authentication algorithm to detect tampering.
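The bit-flipping problem, and how an authentication tag catches it, can be shown with a small Python sketch. As before, a truncated HMAC-SHA-256 stands in for AES-encrypting the counter, and HMAC-SHA-256 over the ciphertext is just one possible authentication algorithm that I have assumed here; the keys, message, and function names are made up for the demonstration.

```python
import hashlib
import hmac

BLOCK = 16  # bytes


def ctr_xcrypt(key, start_counter, data):
    """CTR-style XOR with an HMAC-based keystream (stand-in for AES, an assumption)."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        counter_block = (start_counter + offset // BLOCK).to_bytes(BLOCK, "big")
        keystream = hmac.new(key, counter_block, hashlib.sha256).digest()[:BLOCK]
        out.extend(a ^ b for a, b in zip(data[offset:offset + BLOCK], keystream))
    return bytes(out)


def make_tag(mac_key, ciphertext):
    """Authentication tag computed over the ciphertext (encrypt-then-MAC)."""
    return hmac.new(mac_key, ciphertext, hashlib.sha256).digest()


def verify(mac_key, ciphertext, tag):
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(make_tag(mac_key, ciphertext), tag)


enc_key = b"encryption key (demo)"
mac_key = b"separate MAC key (demo)"
plaintext = b"PAY 0100 TO BOB"

ciphertext = ctr_xcrypt(enc_key, 0, plaintext)
tag = make_tag(mac_key, ciphertext)

# The attacker flips the lowest bit of byte 4 without knowing any key.
tampered = bytearray(ciphertext)
tampered[4] ^= 0x01
tampered = bytes(tampered)

print(ctr_xcrypt(enc_key, 0, tampered))   # the amount has silently changed
print(verify(mac_key, ciphertext, tag))   # genuine message is accepted
print(verify(mac_key, tampered, tag))     # tampered message is rejected
```

Without the tag, the receiver would decrypt the tampered message to "PAY 1100 TO BOB" and have no way of knowing. In a real system the tag would normally also cover the starting counter value, and the check must happen before the decrypted data is used.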

Another “gotcha” is that cipher-block chaining (CBC) mode isn’t suitable for pipelining when encrypting, because each plaintext block is combined with the previous ciphertext block, so the core needs to wait until a data block is completely processed before it can start the next one (pipelining is covered in more detail later). If you’re unsure of which mode to use, talk to your IP core provider.
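The chaining that gets in the way of pipelining is visible in a short Python sketch of CBC encryption. The block cipher is again a toy SHA-256-based Feistel network that I have assumed as a stand-in for AES (not secure), and the all-zero IV is used only to make the demonstration repeatable.

```python
import hashlib

BLOCK = 16  # bytes


def toy_encrypt_block(block, key, rounds=4):
    """Toy 128-bit Feistel cipher standing in for AES (NOT secure)."""
    left, right = block[:8], block[8:]
    for i in range(rounds):
        f = hashlib.sha256(key + bytes([i]) + right).digest()[:8]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def cbc_encrypt(data, key, iv):
    """CBC: XOR each plaintext block with the previous ciphertext block, then encrypt."""
    previous = iv
    out = []
    for offset in range(0, len(data), BLOCK):
        block = data[offset:offset + BLOCK]
        mixed = bytes(a ^ b for a, b in zip(block, previous))
        # This result is needed before the next block can even start.
        previous = toy_encrypt_block(mixed, key)
        out.append(previous)
    return out


key = b"demo key"
iv = bytes(BLOCK)  # all-zero IV for repeatability only; real IVs must be unpredictable

same = cbc_encrypt(b"A" * 48, key, iv)
changed = cbc_encrypt(b"B" * 16 + b"A" * 32, key, iv)

print("repeated plaintext hidden:", len(set(same)) == 3)
print("last block depends on first:", same[2] != changed[2])
```

Changing only the first plaintext block changes every ciphertext block after it. That is good for hiding patterns, but it is also the serial dependency that stops the encryption core from working on several blocks of one stream at once.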

Data throughput and FPGA resources

I hinted earlier that data throughput and FPGA resources are related, generally in two ways – by varying the data-path width through the encryption unit or adding pipelining. The algorithm operates on 128-bit data blocks, so it’s possible to operate with a narrow data path, say 32 bits wide, in multiple stages and save silicon resources. That works fine for data rates up to around 500 Mbps because the clock needed by the core is achievable on an FPGA. Widening the data path to the maximum 128 bits uses more resources, but quadruples the throughput for the same clock. When this still isn’t enough, some AES modes can use parallel processing by adding pipelines. Again, this is an example of trading FPGA silicon for performance.
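A rough back-of-envelope model shows how data-path width and pipelining trade silicon for speed. Everything in this Python sketch is an assumption of mine rather than a figure from any real core: it supposes an iterative design that completes one AES round per pass through the data path, ten rounds (a 128-bit key), a hypothetical 160 MHz clock, and a fully unrolled pipeline that accepts one block per clock.

```python
AES_BLOCK_BITS = 128


def iterative_throughput_bps(datapath_bits, clock_hz, rounds=10):
    """Simplified model: each round takes (128 / datapath width) clock cycles."""
    clocks_per_round = AES_BLOCK_BITS // datapath_bits
    clocks_per_block = rounds * clocks_per_round
    return AES_BLOCK_BITS * clock_hz / clocks_per_block


def pipelined_throughput_bps(clock_hz):
    """Simplified model: a fully unrolled pipeline accepts one block every clock."""
    return AES_BLOCK_BITS * clock_hz


clock = 160e6  # hypothetical FPGA clock in Hz

narrow = iterative_throughput_bps(32, clock)
wide = iterative_throughput_bps(128, clock)
pipelined = pipelined_throughput_bps(clock)

print(f"32-bit data path:  {narrow / 1e6:.0f} Mbps")
print(f"128-bit data path: {wide / 1e6:.0f} Mbps")
print(f"fully pipelined:   {pipelined / 1e9:.2f} Gbps")
```

Under these assumptions the 128-bit path gives four times the throughput of the 32-bit path at the same clock, and the pipelined version gains another factor equal to the number of rounds. Real cores will differ, so treat the numbers as illustrative only.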

Let’s step back from too many details about the IP cores to consider whether you should even be using a freely available encryption scheme. You might think that if a potential attacker knows exactly how a message is encrypted, his job is made easier. The alternative is to use secret encryption algorithms, sometimes called “security by obscurity.”

There are some major benefits in using published methods. The National Institute of Standards and Technology (NIST) is the agency that publishes the AES specification (FIPS 197) and the recommended modes of operation (SP 800-38A). AES was selected after a public evaluation in which the candidate proposals were examined for potential weaknesses. In fact, some loopholes have been closed by new recommendations. The widely publicized (but unrelated) revelation about the Heartbleed bug in the open-source OpenSSL actually shows that no process is perfect. The big difference is that NIST invites comments, but doesn’t allow reviewers to modify the algorithm. So, on balance, having detailed analysis of various attack methods makes the algorithm stronger.

Read Part 2 – Encryption 201: Static versus dynamic data

Paul Dillien has worked with Algotronix Ltd. covering sales and marketing for the past six years. He previously worked in the FPGA industry, and is the author of The FPGA Market report. Paul is a Chartered Engineer and has worked in strategic and tactical marketing roles for leading U.S. and UK semiconductor companies and has specializations in competitive analysis and negotiation.

Paul Dillien, Algotronix Ltd.