Lessons learned from experience – commonly from solving a problem – are often so much more effective than any amount of reading in books (or blogs!) – and this is how I learned about endianness.
It was 1981 (yes, nearly 40 years ago!) and we were building a system with a DEC PDP-11 minicomputer interfaced to a Texas TMS990 microprocessor via a shared memory. These were both 16-bit processors, so we passed data as words. But something odd was happening: one CPU would write a value into a word of shared memory, but when the other CPU read it out, the bytes were swapped. The fix for the problem was easy enough: just write a simple access routine on one side that swapped the bytes and ensure that this was always used for access to the shared memory. It was only later that we learned why this problem occurred.
In almost all modern embedded systems, memory is organized into bytes. CPUs, however, can also process data as 16- or 32-bit words. In this context, a decision needs to be made with regard to how the bytes in a word are stored in memory. There are two obvious options and a number of other variations. The property that describes this byte ordering is called "endianness" (or, sometimes, "endianity").
The two common forms of endianness are: least significant byte stored at lowest address ("little-endian") and most significant byte stored at lowest address ("big-endian"). There are other variations on byte ordering and even possibilities for how the bits are stored.
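To make the two layouts concrete, here is a minimal sketch that copies the bytes of a 32-bit value in order of increasing address and uses the result to tell which byte order the host uses. The function names are my own, chosen for illustration, and the sketch assumes 8-bit bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the bytes of a 32-bit value into out[] in order of increasing
 * memory address, so the storage layout can be inspected.
 * (Hypothetical helper; the name is illustrative only.) */
static void bytes_in_memory_order(uint32_t value, unsigned char out[4])
{
    const unsigned char *p = (const unsigned char *) &value;
    size_t i;

    for (i = 0; i < sizeof value; i++) {
        out[i] = p[i];
    }
}

/* Returns 1 if the least significant byte sits at the lowest address
 * (little-endian), 0 otherwise. */
static int host_is_little_endian(void)
{
    unsigned char b[4];

    bytes_in_memory_order(0x0a0b0c0dU, b);
    return b[0] == 0x0d;
}
```

On a little-endian host the bytes come out as 0d 0c 0b 0a; on a big-endian host they come out as 0a 0b 0c 0d.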
Broadly speaking, the endianness in use is determined by the CPU. Because there are a number of options, it is unsurprising that different semiconductor vendors have chosen different endianness for their CPUs. Intel CPUs have traditionally been little-endian; Freescale tended to favor big-endian. Many modern CPU architectures, such as ARM and Power, support both byte orders, with the choice made by software or by hardware configuration.
The questions, from an embedded software engineer’s perspective, are, "Does endianness matter?" and, "If so, how much?"
There are broadly two circumstances when a software developer needs to think about endianness:
- Data transmitted over a communications link or network
- Data handled in multiple representations in software
The former situation is quite straightforward – simply a matter of following or defining a protocol. The latter is trickier and requires some thought.
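For the communications case, one common approach is to build the byte stream with shifts rather than by copying the word out of memory, so the code behaves identically on any host. The sketch below assumes a protocol that sends 32-bit values most significant byte first; that byte order and the function names are assumptions for illustration, not taken from any particular protocol.

```c
#include <stdint.h>

/* Store a 32-bit value into buf[0..3], most significant byte first.
 * Shifts operate on the value, not on its memory layout, so the
 * result does not depend on the endianness of the host CPU. */
static void put_u32_be(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char) ((v >> 24) & 0xffU);
    buf[1] = (unsigned char) ((v >> 16) & 0xffU);
    buf[2] = (unsigned char) ((v >> 8) & 0xffU);
    buf[3] = (unsigned char) (v & 0xffU);
}

/* Rebuild a 32-bit value from four bytes received most significant
 * byte first. */
static uint32_t get_u32_be(const unsigned char *buf)
{
    return ((uint32_t) buf[0] << 24) |
           ((uint32_t) buf[1] << 16) |
           ((uint32_t) buf[2] << 8) |
           (uint32_t) buf[3];
}
```

A sender calls put_u32_be() before transmitting and a receiver calls get_u32_be() on the incoming bytes; neither side needs to know the other's endianness.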
Consider this code:
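unsigned int n = 0x0a0b0c0d;   /* assumes unsigned int is 32 bits wide */
unsigned char c, d, *p;
c = (unsigned char) n;         /* conversion by value: keeps the low-order byte */
p = (unsigned char *) &n;      /* point at the lowest-addressed byte of n */
d = *p;                        /* read whichever byte is stored there */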
What values would c and d contain at the end? Whatever the endianness, c should contain the value 0x0d. However, the value of d will depend on the endianness. On a little-endian system d will contain 0x0d; on big-endian it will have the value 0x0a. The same kind of effect would be observed if a union were to be made between n and, say, unsigned char a.
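The union version might look like the sketch below. The type and function names are hypothetical, and I have used a four-byte array so that every byte of the word is visible; reading a[0] gives the same endianness-dependent result as d in the pointer example.

```c
#include <stdint.h>

/* Overlay a 32-bit word with its individual bytes.
 * (Illustrative type; a[0] is the byte at the lowest address.) */
union word_bytes {
    uint32_t n;
    unsigned char a[4];
};

/* Returns the byte stored at the lowest address of the word:
 * 0x0d for 0x0a0b0c0d on little-endian, 0x0a on big-endian. */
static unsigned char lowest_address_byte(uint32_t value)
{
    union word_bytes u;

    u.n = value;
    return u.a[0];
}

/* Returns the least significant byte by value conversion, which is
 * the same on every host regardless of endianness. */
static unsigned char low_order_byte(uint32_t value)
{
    return (unsigned char) (value & 0xffU);
}
```

The first function depends on how the word is laid out in memory; the second depends only on its value, which is why c in the example above is always 0x0d.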
So, does this matter? All those years ago, it mattered to me! With care, however, most code may be written to be independent of endianness and I would contend that almost all well-written code would be like this. However, if you do build in an endianness dependency, as I needed to do, good documentation and commenting are essential.
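As a rough illustration of what I mean, a byte-swapping access layer for a shared memory like the one in my 1981 story could be sketched as follows. This is not the original code; the names are invented here, and it assumes the CPU on the other side uses the opposite byte order.

```c
#include <stdint.h>

/* Exchange the two bytes of a 16-bit word. */
static uint16_t swap16(uint16_t w)
{
    return (uint16_t) ((w << 8) | (w >> 8));
}

/*
 * ENDIANNESS DEPENDENCY: the CPU on the other side of the shared
 * memory is assumed to use the opposite byte order to this one, so
 * every word is byte-swapped on the way in and on the way out.
 * All shared-memory access must go through these two routines.
 */
static uint16_t shared_read16(const volatile uint16_t *addr)
{
    return swap16(*addr);
}

static void shared_write16(volatile uint16_t *addr, uint16_t value)
{
    *addr = swap16(value);
}
```

Keeping the swap in one documented place means the dependency is visible to whoever maintains the code, and it can be removed in a single edit if the hardware changes.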