How to detect machine word size in C/C++?

I'll give you the right answer to the question you should be asking:

Q: How do I choose the fastest hash routine for a particular machine, given that I don't have to use any particular one and that it only has to be consistent within a single build (or maybe run) of an application?

A: Implement a parametrized hashing routine, possibly using a variety of primitives including SIMD instructions. On a given piece of hardware, some subset of these will work, and you will want to enumerate that subset using some combination of compile-time #ifdefs and dynamic CPU feature detection. (E.g. you can't use AVX2 on any ARM processor, which is determined at compile time, and you can't use it on older x86 parts, which is determined at run time via the CPUID instruction.) Take the set that works and time its members on test data on the machines of interest. Either do so dynamically at system/application startup, or test as many cases as you can and hard-code which routine to use on which system based on some sniffing algorithm. (E.g. the Linux kernel does this to determine the fastest memcpy routine, etc.)
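
Here is a minimal sketch of that two-level dispatch, with the caveats that it assumes GCC or Clang (for __builtin_cpu_supports, which queries CPUID under the hood) and that the hash routines are hypothetical stand-ins, with the "AVX2" variant merely delegating to the scalar one so the sketch links everywhere:

#include <cstddef>
#include <cstdint>
#include <cstdio>

// Portable scalar fallback (64-bit FNV-1a) -- always available.
static std::uint64_t hash_scalar(const void* data, std::size_t n)
{
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::uint64_t h = 14695981039346656037ull;
    while (n--) { h ^= *p++; h *= 1099511628211ull; }
    return h;
}

#if defined(__x86_64__) || defined(__i386__)
// Hypothetical AVX2 variant: a real one would use 256-bit intrinsics.
static std::uint64_t hash_avx2(const void* data, std::size_t n)
{
    return hash_scalar(data, n); // placeholder body
}
#endif

using hash_fn = std::uint64_t (*)(const void*, std::size_t);

// Compile-time #ifdefs narrow the candidate set; the CPUID-backed
// builtin picks among the survivors at run time.
static hash_fn select_hash()
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2"))
        return hash_avx2;
#endif
    return hash_scalar;
}

int main()
{
    std::printf("%llx\n", (unsigned long long)select_hash()("hello", 5));
}

A real version would time all surviving candidates on test data rather than assuming the widest SIMD path wins.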

The circumstances under which you need the hash to be consistent will be application-dependent. If you need the choice to be made entirely at compile time, then you'll need to craft a set of tests around the preprocessor macros the compiler predefines. Often it is possible to have multiple implementations that produce the same hash while using different hardware approaches for different sizes.
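
For instance (my sketch, not part of the original advice), a purely compile-time choice can key off macros the compiler predefines -- __AVX2__ under GCC/Clang (and MSVC with /arch:AVX2), __ARM_NEON under GCC/Clang on ARM:

#include <cstdio>

int main()
{
    // The messages are placeholders; real code would select a hash
    // implementation here instead of printing.
#if defined(__AVX2__)
    std::puts("built with AVX2: use the AVX2 hash");
#elif defined(__ARM_NEON)
    std::puts("built with NEON: use the NEON hash");
#else
    std::puts("no SIMD assumed at build time: use the scalar hash");
#endif
}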

Skipping SIMD is probably not a good idea if you are defining a new hash and want it to be really fast, though in some applications it may be possible to saturate the memory bandwidth without SIMD, in which case it doesn't matter.

If all of that sounds like too much work, use size_t as the accumulator size. Or use the largest type for which std::atomic reports that it is lock-free; see std::atomic_is_lock_free, std::atomic::is_lock_free, or std::atomic::is_always_lock_free.
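
A C++17 sketch of that last suggestion, picking the widest unsigned type whose std::atomic specialization is always lock-free and falling back to size_t:

#include <atomic>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <type_traits>

// Widest unsigned type whose std::atomic is always lock-free, per
// C++17's is_always_lock_free; size_t is the fallback.
using accum_t = std::conditional_t<
    std::atomic<std::uint64_t>::is_always_lock_free, std::uint64_t,
    std::conditional_t<
        std::atomic<std::uint32_t>::is_always_lock_free, std::uint32_t,
        std::size_t>>;

int main()
{
    std::cout << "accumulator size = " << sizeof(accum_t) << " bytes\n";
}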


I think you want

sizeof(size_t), which is supposed to be the size of an index, i.e. what you use in ar[index].

32-bit machine

char 1
int 4
long 4
long long 8
size_t 4

64-bit machine

char 1
int 4
long 8
long long 8
size_t 8

It may be more complicated than that, because 32-bit compilers run on 64-bit machines: their output is 32-bit even though the machine is capable of more.

I've added Windows compilers below.

Visual Studio 2012, compiled for Win32

char 1
int 4
long 4
long long 8
size_t 4

Visual Studio 2012, compiled for x64

char 1
int 4
long 4
long long 8
size_t 8
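
A short program along these lines reproduces the tables above on whatever compiler you hand it to; building it once with a 32-bit compiler and once with a 64-bit compiler on the same machine makes the point about compilers concrete:

#include <cstddef>
#include <iostream>

int main()
{
    std::cout << "char      " << sizeof(char)        << '\n'
              << "int       " << sizeof(int)         << '\n'
              << "long      " << sizeof(long)        << '\n'
              << "long long " << sizeof(long long)   << '\n'
              << "size_t    " << sizeof(std::size_t) << '\n';
}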

Even at the machine-architecture level, a word may be multiple things. AFAIK you have different hardware-related quantities:

  • character: generally speaking, the smallest element that can be exchanged to or from memory - it is now almost everywhere 8 bits, but used to be 6 on some older architectures (CDC in the early 80s)
  • integer: an integer register (e.g. EAX on an x86). IMHO an acceptable approximation is sizeof(int)
  • address: what can be addressed on the architecture. IMHO an acceptable approximation is sizeof(uintptr_t) (both approximations can be checked with the sketch after this list)
  • not to mention floating point...
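
A quick check of those approximations on a given toolchain (my sketch; CHAR_BIT from <climits> covers the odd character sizes):

#include <climits>
#include <cstdint>
#include <iostream>

int main()
{
    std::cout << "character: " << CHAR_BIT                          << " bits\n"
              << "integer:   " << sizeof(int) * CHAR_BIT            << " bits\n"
              << "address:   " << sizeof(std::uintptr_t) * CHAR_BIT << " bits\n";
}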

Let's do some history:

Machine class     |   character    |  integer    | address
-----------------------------------------------------------
old CDC           |     6 bits     |    60 bits  |  ?
8086              |     8 bits     |    16 bits  |  2x16 bits (*)
80x86 (x >= 3)    |     8 bits     |    32 bits  |  32 bits
64-bit machines   |     8 bits     |    32 bits  |  64 bits
                  |                |             |
general case (**) |     8 bits     | sizeof(int) | sizeof(uintptr_t)

(*) it was a special addressing mode in which a 16-bit segment register was shifted left by 4 bits and added to a 16-bit offset to produce a 20-bit address - but far pointers used to be 32 bits long

(**) uintptr_t does not make much sense on old architectures because the compilers (when they existed) did not support that type. But if a decent compiler were ported to them, I assume the values would be as shown.

But BEWARE: the types are defined by the compiler, not the architecture. That means that if you ran an 8-bit compiler on a 64-bit machine, you would probably get a 16-bit int and a 16-bit uintptr_t. So the above only makes sense if you use a compiler adapted to the architecture...


Because the C and C++ languages deliberately abstract away such considerations as the machine word size, it's unlikely that any method will be 100% reliable. However, there are the various int_fastXX_t types that may help you infer the size. For example, this simple C++ program:

#include <iostream>
#include <cstdint>

#define SHOW(x) std::cout << # x " = " << x << '\n'

int main()
{
    SHOW(sizeof(int_fast8_t));
    SHOW(sizeof(int_fast16_t));
    SHOW(sizeof(int_fast32_t));
    SHOW(sizeof(int_fast64_t));
}

produces this result using gcc version 5.3.1 on my 64-bit Linux machine:

sizeof(int_fast8_t) = 1
sizeof(int_fast16_t) = 8
sizeof(int_fast32_t) = 8
sizeof(int_fast64_t) = 8

This suggests that one means of discovering the register size might be to look for the largest difference between a required size (e.g. 2 bytes for a 16-bit value) and the corresponding int_fastXX_t size, and to use the size of that int_fastXX_t as the register size.
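
Spelled out as code, that heuristic looks something like this (my sketch; when nothing is widened it simply falls back to sizeof(int)):

#include <cstddef>
#include <cstdint>
#include <iostream>

int main()
{
    // (required width, actual width) for each int_fastXX_t
    const struct { std::size_t required, fast; } pairs[] = {
        {1, sizeof(int_fast8_t)},
        {2, sizeof(int_fast16_t)},
        {4, sizeof(int_fast32_t)},
        {8, sizeof(int_fast64_t)},
    };

    std::size_t best_diff = 0;
    std::size_t guess = sizeof(int); // fallback when nothing is widened
    for (const auto& p : pairs)
        if (p.fast - p.required > best_diff) {
            best_diff = p.fast - p.required;
            guess = p.fast;
        }
    std::cout << "guessed register size = " << guess << " bytes\n";
}

On the 64-bit Linux figures above it guesses 8; on the Visual Studio 2013 and 32-bit figures below it guesses 4.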

Further results

Windows 7, gcc 4.9.3 under Cygwin on 64-bit machine: same as above

Windows 7, Visual Studio 2013 (v 12.0) on 64-bit machine:

sizeof(int_fast8_t) = 1
sizeof(int_fast16_t) = 4
sizeof(int_fast32_t) = 4
sizeof(int_fast64_t) = 8

Linux, gcc 4.6.3 on 32-bit ARM and also Linux, gcc 5.3.1 on 32-bit Atom:

sizeof(int_fast8_t) = 1
sizeof(int_fast16_t) = 4
sizeof(int_fast32_t) = 4
sizeof(int_fast64_t) = 8