Understanding CPUs and the Business of CPUs Better

I’ve been reading Jon Stokes’ Inside the Machine, and it’s a very good read. In particular I was struck by a couple of simple aspects of how CPUs work.

ISA

First, let’s discuss ISAs (instruction set architectures). x86 is a famous one created by Intel. POWER is an ISA created by IBM. PowerPC was created by IBM, Motorola, and Apple. ISAs may evolve, but they stay relatively consistent (usually backwards compatible) as new CPU designs that use that ISA are created. For developers, think of the ISA as the API (application programming interface) of the chip, because the implementation can vary drastically. For example, many x86 processors take the complicated instructions x86 allows for and execute them internally as a series of simpler, RISC-like sub-instructions. As a programmer in any language (including assembly), you only care that the ISA is still the same, not how the work is done.
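To make the "ISA as API" idea concrete, here’s a quick sketch of my own (not from the book). The same C source below compiles unchanged for x86, ARM, or PowerPC; the predefined macros are real GCC/Clang macros that reveal which ISA the build targeted, but add() behaves identically on every implementation of a given ISA:

```c
/* A minimal sketch: the same C source targets whatever ISA the compiler
 * is told to emit. Predefined GCC/Clang macros identify the target ISA,
 * but the program's behavior is the same everywhere. */
#include <stdio.h>

int add(int a, int b) { return a + b; }  /* a single ADD on most ISAs */

int main(void) {
#if defined(__x86_64__)
    puts("Compiled for the x86-64 ISA");
#elif defined(__i386__)
    puts("Compiled for the 32-bit x86 ISA");
#elif defined(__arm__) || defined(__aarch64__)
    puts("Compiled for an ARM ISA");
#elif defined(__powerpc__)
    puts("Compiled for a POWER/PowerPC ISA");
#else
    puts("Compiled for some other ISA");
#endif
    printf("add(2, 3) = %d\n", add(2, 3));
    return 0;
}
```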

ISAs are disconnected from manufacturers. They can be licensed. While Intel comes to mind when you think of x86, AMD produces chips as well. ARM chips are licensed and produced by all kinds of manufacturers, including Qualcomm, Apple and more.

So what happens when a device or platform manufacturer changes processors? Let’s look at a couple of examples. In an environment where backwards compatibility is paramount, it’s very hard to change ISAs. Microsoft has yet to do it, although the upcoming Windows 8 will support ARM. They will face the issue that Apple dealt with when the Macintosh line switched from PowerPC to x86/x64 chips. Apple had to provide a software compatibility layer (named Rosetta). Appropriately named, it translated the low-level language of PowerPC instructions into x86/x64 instructions. Eventually, Apple made its development tools optionally support “Universal” binaries, so-called “fat binaries” because they contain the instructions for both ISAs, and the build of the operating system for each ISA knows how to select the correct portion of the binary for itself. Microsoft appears to be taking the simpler route of not providing translation for legacy applications to run on ARM. Still, its toolchain going forward will have to provide builds for both ISAs. Presumably, with some foresight, the installer could contain both binaries and install only the correct one. This would be valuable considering the ARM devices are presumably tablets, where space considerations still matter. Regardless of the path forward, developers need to recompile all code to support the migration, including 3rd-party and shared libraries.
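To make the fat-binary idea concrete, here’s a rough sketch of mine (not Apple’s code) that reads the header of a Universal binary. The struct layouts mirror Apple’s documented <mach-o/fat.h>, but they’re defined locally so the sketch compiles anywhere; all on-disk fields are big-endian:

```c
/* A hedged sketch of the "fat" (Universal) Mach-O layout: one header,
 * then one fat_arch record per ISA slice. The OS loader picks the slice
 * matching its own ISA. */
#include <stdint.h>
#include <stdio.h>
#include <arpa/inet.h>   /* ntohl: on-disk fields are big-endian */

#define FAT_MAGIC 0xcafebabeu

struct fat_header {
    uint32_t magic;      /* FAT_MAGIC */
    uint32_t nfat_arch;  /* number of per-ISA slices that follow */
};

struct fat_arch {
    uint32_t cputype;    /* e.g. 7 = x86, 18 = PowerPC */
    uint32_t cpusubtype;
    uint32_t offset;     /* file offset of this slice's executable */
    uint32_t size;
    uint32_t align;
};

int main(int argc, char **argv) {
    if (argc < 2) { fprintf(stderr, "usage: %s <binary>\n", argv[0]); return 1; }
    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }

    struct fat_header h;
    if (fread(&h, sizeof h, 1, f) != 1 || ntohl(h.magic) != FAT_MAGIC) {
        puts("Not a fat binary (single-ISA image).");
        fclose(f);
        return 0;
    }

    uint32_t n = ntohl(h.nfat_arch);
    printf("Fat binary with %u slices:\n", n);
    for (uint32_t i = 0; i < n; i++) {
        struct fat_arch a;
        if (fread(&a, sizeof a, 1, f) != 1) break;
        printf("  slice %u: cputype=%u, %u bytes at offset %u\n",
               i, ntohl(a.cputype), ntohl(a.size), ntohl(a.offset));
    }
    fclose(f);
    return 0;
}
```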

What about a more controlled platform, like a game console? For example, the original Sony PSP used a MIPS chip, while the PS Vita uses an ARM chip. In this case, the clear line between product generations makes the transition easier. Any code has to be recompiled, just like before, but that is an expected cost among software makers for consumer devices. New devices come with new high-level APIs and operating system calls, and those are the real adjustment for a programmer making software on such devices. If Sony does choose to support downloadable PSP games on the Vita, it will be on them to provide the compatibility layer.

Microarchitecture and Processor Lines

Now that we understand that the ISA doesn’t dictate implementation, it’s worth explaining that the actual implementation is called a microarchitecture. Changes in microarchitecture do not change the ISA. So, for a counter-example, when the x86 ISA got MMX extensions, those resulted in new instructions. That is not a microarchitecture change, but an ISA change. The chip can execute the new required instructions any way it sees fit; MMX just means it handles those instructions. An example of a microarchitecture change is when Intel’s microarchitectures started using out-of-order execution of instructions to optimize the efficiency of loading instructions (and reduce pipeline bubbles, but that’s a longer topic for another time).
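As an illustration of my own (not from the book), here’s what “handling MMX” looks like from software: the new packed-integer instruction (PADDW) is exposed through a compiler intrinsic, and a runtime check asks the chip whether it implements that slice of the ISA. Built for x86 with something like `gcc -mmmx mmx_demo.c`:

```c
/* A hedged sketch of using an ISA extension: MMX added new instructions
 * to x86, and any microarchitecture is free to implement them however
 * it likes. */
#include <mmintrin.h>   /* MMX intrinsics */
#include <stdio.h>
#include <string.h>

int main(void) {
    /* Runtime check (GCC/Clang builtin): does this CPU expose MMX? */
    if (!__builtin_cpu_supports("mmx")) {
        puts("CPU does not report MMX support");
        return 0;
    }

    /* One MMX instruction (PADDW) adds four 16-bit lanes at once. */
    __m64 a = _mm_set_pi16(4, 3, 2, 1);
    __m64 b = _mm_set_pi16(40, 30, 20, 10);
    __m64 sum = _mm_add_pi16(a, b);

    short lanes[4];
    memcpy(lanes, &sum, sizeof lanes);
    _mm_empty();  /* leave MMX state; required before x87 float use */

    printf("%d %d %d %d\n", lanes[0], lanes[1], lanes[2], lanes[3]);
    return 0;
}
```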

Microarchitecture changes can result in real performance differences. Various clever tricks like pipelining, branch prediction, and more can drastically improve the throughput of a processor without affecting its clock speed. When one chip vendor seems to be in the lead in benchmarks, but the processor numbers (like speed and cache) are the same, it’s usually a sign that said vendor has a better microarchitecture at the moment.
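As a concrete (if informal) experiment of my own, the loop below does identical work over the same values twice, but the branch is unpredictable on random data and trivially predictable once the data is sorted; on most modern chips the sorted run is dramatically faster even though the instruction count is the same. Compile with light optimization (e.g. `gcc -O1`) so the compiler doesn’t replace the branch with a conditional move:

```c
/* A small demo of branch prediction: same data, same comparisons, very
 * different wall-clock time depending on how predictable the branch is. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N    (1 << 20)
#define REPS 100

static long sum_big(const int *v, int n) {
    long s = 0;
    for (int i = 0; i < n; i++)
        if (v[i] >= 128)   /* taken ~50% of the time on random data */
            s += v[i];
    return s;
}

static int cmp(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

int main(void) {
    int *v = malloc(N * sizeof *v);
    for (int i = 0; i < N; i++) v[i] = rand() % 256;

    clock_t t0 = clock();
    long s1 = 0;
    for (int r = 0; r < REPS; r++) s1 += sum_big(v, N);
    double unsorted = (double)(clock() - t0) / CLOCKS_PER_SEC;

    qsort(v, N, sizeof *v, cmp);  /* sorted data => predictable branch */

    t0 = clock();
    long s2 = 0;
    for (int r = 0; r < REPS; r++) s2 += sum_big(v, N);
    double sorted = (double)(clock() - t0) / CLOCKS_PER_SEC;

    printf("unsorted: %.2fs  sorted: %.2fs  (sums %ld %ld)\n",
           unsorted, sorted, s1, s2);
    free(v);
    return 0;
}
```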

With that in mind, it is much easier to decode the processor lines than it would first seem. Product names change a lot, but the microarchitectures stick around for a while. If you look at that info, you’ll find that the product lines using the same microarchitecture differ by cost, cache size(s), clock speed, power consumption, transistor density, etc. It helps to look through a list like this of microarchitectures released by a company. Just be aware that some of the codenames are really just die-shrunk versions of earlier microarchitectures. You’ll see a power/heat change in that case, but it’s largely just a manufacturing change.
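If you’re curious which microarchitecture your own x86 machine belongs to, the CPUID instruction reports vendor, family, and model numbers that the vendor’s documentation maps back to a microarchitecture codename. Here’s a sketch using GCC/Clang’s <cpuid.h> (x86-only; the bit layout follows Intel’s manuals):

```c
/* A hedged sketch: read CPUID leaves 0 and 1 and decode the family and
 * model fields, which identify the microarchitecture generation. */
#include <cpuid.h>   /* GCC/Clang helper for the x86 CPUID instruction */
#include <stdio.h>
#include <string.h>

int main(void) {
    unsigned eax, ebx, ecx, edx;

    /* Leaf 0: the 12-byte vendor string is spread across EBX, EDX, ECX. */
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx)) return 1;
    char vendor[13];
    memcpy(vendor + 0, &ebx, 4);
    memcpy(vendor + 4, &edx, 4);
    memcpy(vendor + 8, &ecx, 4);
    vendor[12] = '\0';

    /* Leaf 1: EAX packs stepping/model/family plus extended fields. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 1;
    unsigned base_family = (eax >> 8) & 0xF;
    unsigned base_model  = (eax >> 4) & 0xF;
    unsigned family = base_family;
    unsigned model  = base_model;
    if (base_family == 0xF)
        family += (eax >> 20) & 0xFF;              /* extended family */
    if (base_family == 0x6 || base_family == 0xF)
        model += ((eax >> 16) & 0xF) << 4;         /* extended model  */

    /* e.g. GenuineIntel family 6, model 26 is a Nehalem-based Core i7. */
    printf("vendor=%s family=%u model=%u\n", vendor, family, model);
    return 0;
}
```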

Summary

So what does all this mean? Hopefully, when you see benchmarks, or discussions about major platform or tooling changes driven by chip changes, it will make a bit more sense. And processor shopping should be a little easier if you understand that once you zero in on a microarchitecture you prefer, you can slide up and down the cost scale a bit based on clock speed, cache size, etc. Certainly this basic understanding has emphasized to me that clock speed isn’t everything. One need only look at the benchmarks of two different microarchitectures to see how big the differences can be. For example, see this comparison of an Intel Core i7 (Nehalem microarchitecture) and an AMD Phenom II (K10 microarchitecture). You can see real differences in there.

And finally, as you read about various hardware configurations, you should begin to recognize where certain ISAs fit as the best tool for the job. The pure efficiency and power of IBM’s POWER ISA is the reason it still has such a stronghold in supercomputing and other big-iron applications, while ARM’s low-power efficiency and licensing flexibility make it the clear leader in portable devices. Take the iPad: Apple’s A4 and A5 chips may sound like new inventions, but each is just a new implementation of the ARM ISA with an on-chip GPU. Finally, x86’s desktop software library and price/performance balance have kept it king of the desktop computer for a long time running.

Interesting speculation to think about: with Windows 8 and OS X Lion both heading in a tablet-friendly direction, and Windows 8 and Apple tablets running ARM, you have to wonder if Apple and Microsoft won’t move away from x86/x64 in order to simplify their developer tools by eliminating the need to compile for two different ISAs.

Comments

One response to “Understanding CPUs and the Business of CPUs Better”

  1. Mike Lindegarde

    Interesting read. It will be interesting to see where things go in the next few years. When considering whether or not ARM will ever completely replace x86/x64, I think it’s important to consider what the ARM processor is lacking: floating point operations.

    While dealing with floating point computations isn’t that important for 99% of the tasks most mobile devices handle, it can be quite important when doing some of the computationally intensive tasks servers and desktops are typically associated with. Fortunately for the ARM processor, GPUs are more or less evolving into math co-processors that can handle the FP operations. The fact that most devices now have a dedicated GPU does help ARM-based processors gain ground on the x86.

    I have no clue what the future holds, but it does seem fairly obvious that devices are becoming more and more specialized. As that happens, the number of places where you need a CPU that can handle anything starts to dwindle. With that in mind, I can see the popularity of the ARM processor easily surpassing the x86, but not necessarily replacing it.