Sunday, July 8, 2018

ARM vs x86 - the speed difference

Korona, a fellow OS developer, asked me about the real speed differences between x86 and ARM. Here is what to expect, point by point:

1. Simple integer performance is very similar (for algorithms like for(int d=0;d<10;d++) a+=b+c+d; ).

2. But ARM lacks complex instructions. x86 can do memory access, stack access, and array address computation within a single instruction. ARM can't do this, as it is a load/store architecture: it typically needs a separate instruction for the address calculation, then a dedicated instruction to move the data from or to memory, and the actual arithmetic operation is yet another instruction. This results in more instructions and therefore less performance. Modern ARM CPUs are superscalar, however, so they can execute opcodes in parallel, and not every x86 opcode executes within a single clock cycle either. Still, ARM is at a huge disadvantage when executing the binary of complex source code.

3. Floating point performance on ARM is horrible. In white papers they will try to dazzle you with magical FLOPS numbers, but in reality a 4-core 45 nm Snapdragon is barely able to exceed 200 megaflops per core, which is far below x86 performance (a typical x86 CPU can execute almost as many floating point instructions per second as integer ones). Therefore, you may want to avoid floating point calculations in the speed-critical parts of your code.

4. Benchmarks on this topic are fake and biased; they are usually created by lobbyists and shills.

5. ARM lacks some instructions. For example, ARM v5 has no dedicated bswap instruction (the REV byte-reverse instruction only arrived with ARM v6). This cripples such CPUs on algorithms that need this type of instruction.

6. GCC tends to generate crappy code on ARM. And on Android, you will even have to ship ARM v5 or v7 binaries for compatibility reasons. 10-15% of phones (the phones in users' hands, not the new ones) still have a v5 CPU, which is unable to execute v7 or v8 code. Therefore, as a programmer, you must target v5, which cannot even do unaligned memory access or other simple tasks.

7. A typical 1 GHz A7/A15 ARM CPU (like a 45 nm Snapdragon) will be 50-80% slower than a single-core Intel Atom N45x with HT on 45 nm, but it highly depends on your algorithm. A modern, strong 14 nm ARM v8 based CPU will be equivalent to the old 45 nm Atom, but will again lag behind a modern 14 nm 4-core Pentium N by 50-80%. In my personal experience with my cross-platform software, the difference usually tends toward the larger end, but I write and test my code on x86, so that may contribute to these large numbers.

8. However, ARM CPUs are only efficient in the 2-3 W range, and low-power x86 CPUs are only usable from around 5-6 W for general purposes, so the two types of CPU do not compete with each other directly.

9. On cell phones, Android applications written in C/C++ are compiled for ARM, and because of this it is impossible to offer a marketable x86 based device on that market. Intel supplies an emulator to execute 32-bit ARM code on its x86 based Android phones, which is efficient in some applications (like IGP-limited games) but inefficient in others (when more computing power is needed). As the Android platform is more or less built around ARM, the x86 boom on cell phones will never happen, especially as we already have ARM v8.
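
To make point 2 more concrete, here is a minimal C sketch. The function name add_to_element is made up for illustration, and the remarks about instruction counts in the comments describe what a compiler can typically do, not measured results.

```c
#include <stddef.h>

/* Hypothetical helper: adds `value` to one element of an int array.
 * On x86 the update can be encoded as a single read-modify-write
 * instruction with a scaled-index memory operand.
 * On a load/store architecture such as ARM, the same statement needs
 * at least a load, an add and a store. */
int add_to_element(int *table, size_t index, int value)
{
    table[index] += value;   /* load + add + store on a load/store ISA */
    return table[index];
}
```

Compiling this for both targets and comparing the assembly output (for example with the -S switch of GCC) is a quick way to see the difference on your own toolchain.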

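If you follow the advice in point 3 and keep floating point out of the hot path, fixed-point integer math is one common replacement. The sketch below assumes a Q16.16 format; the fix16 type and the function names are hypothetical, chosen only for this example.

```c
#include <stdint.h>

/* Q16.16 fixed point: 16 integer bits, 16 fraction bits (assumed format). */
typedef int32_t fix16;
#define FIX16_ONE ((fix16)1 << 16)

/* Converts a small integer to fixed point (no overflow check). */
fix16 fix16_from_int(int n)
{
    return (fix16)(n * FIX16_ONE);
}

/* Multiplies two fixed-point values using integer instructions only.
 * The product is widened to 64 bits so the intermediate result cannot
 * overflow, then scaled back by dividing by 2^16. */
fix16 fix16_mul(fix16 a, fix16 b)
{
    return (fix16)(((int64_t)a * (int64_t)b) / FIX16_ONE);
}
```

For example, multiplying fix16_from_int(3) by FIX16_ONE / 2 yields the fixed-point representation of 1.5. Whether this is actually faster than hardware floating point depends on the CPU, so measure before committing to it.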

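For point 5, this is roughly what a byte swap looks like when the CPU has no dedicated instruction for it. It is a generic sketch using shifts and masks; the name swap_bytes32 is made up for this example.

```c
#include <stdint.h>

/* Portable 32-bit byte swap built from shifts and masks.
 * A compiler may recognize the pattern and emit a single instruction
 * where the target has one; otherwise it runs as several operations. */
uint32_t swap_bytes32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}
```

Calling swap_bytes32(0x11223344u) gives 0x44332211u, and applying it twice returns the original value.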

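For the unaligned access problem in point 6, one common workaround is to copy the bytes instead of dereferencing a misaligned pointer. This is a sketch of that idea, not something taken from my own code base; load_u32_unaligned is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value from an address that may not be 4-byte aligned.
 * memcpy avoids dereferencing a misaligned uint32_t pointer, which is
 * undefined behavior in C and can fault or return wrong data on CPUs
 * without unaligned access support. */
uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

The value is read in the byte order of the host CPU, so this only replaces the raw pointer cast; it does not convert endianness.
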
1 comment:

  1. I wanted to run DawnOS on my Raspberry Pi. Are you saying I should not do this? If it runs then it can be made in direct boot. I only have a 12v power system and there are not many x86 platforms running at 12v. Even the new Latte is 36w.
