How to detect machine word size in C/C++?
I'll give you the right answer to the question you should be asking:
Q: How do I choose the fastest hash routine for a particular machine, given that I don't have to use a specific one and it only needs to be consistent within a single build (or maybe run) of an application?
A: Implement a parametrized hashing routine, possibly using a variety of primitives including SIMD instructions. On a given piece of hardware, some set of these will work, and you will want to enumerate that set using some combination of compile-time #ifdefs and dynamic CPU feature detection. (E.g. you can't use AVX2 on any ARM processor, which is determined at compile time, and you can't use it on older x86, which is determined by the cpuid instruction.) Take the set that works and time them on test data on the machines of interest. Either do so dynamically at system/application startup, or test as many cases as you can and hardcode which routine to use on which system based on some sniffing algorithm. (E.g. the Linux kernel does this to determine the fastest memcpy routine, etc.)
The circumstances under which you need the hash to be consistent will be application dependent. If you need the choice to be entirely at compile time, then you'll need to craft a set of preprocessor macros the compiler defines. Often it is possible to have multiple implementations that produce the same hash but using different hardware approaches for different sizes.
Skipping SIMD is probably not a good idea if you are defining a new hash and want it to be really fast, though in some applications it may be possible to saturate memory bandwidth without SIMD, in which case it doesn't matter.
If all of that sounds like too much work, use size_t as the accumulator size. Or use the largest size for which std::atomic tells you the type is lock-free. See std::atomic_is_lock_free, std::atomic::is_lock_free, or std::atomic::is_always_lock_free.
I think you want sizeof(size_t), which is supposed to be the size of an index, i.e. ar[index].
32-bit machine

char       1
int        4
long       4
long long  8
size_t     4

64-bit machine

char       1
int        4
long       8
long long  8
size_t     8
It may be more complicated than that, because 32-bit compilers run on 64-bit machines. Their output is 32-bit even though the machine is capable of more.
I added Windows compilers below.
Visual Studio 2012, compiled Win32

char       1
int        4
long       4
long long  8
size_t     4

Visual Studio 2012, compiled x64

char       1
int        4
long       4
long long  8
size_t     8
Even at the machine-architecture level, a "word" may mean several things. AFAIK you have different hardware-related quantities:
- character: generally speaking, the smallest element that can be exchanged to or from memory. It is now almost everywhere 8 bits, but it used to be 6 on some older architectures (CDC, in the early 80s)
- integer: an integer register (e.g. EAX on x86). IMHO an acceptable approximation is sizeof(int)
- address: what can be addressed on the architecture. IMHO an acceptable approximation is sizeof(uintptr_t)
- not speaking of floating point...
Let's do some history:
Machine class     | character | integer     | address
------------------|-----------|-------------|------------------
old CDC           | 6 bits    | 60 bits     | ?
8086              | 8 bits    | 16 bits     | 2 x 16 bits (*)
80x86 (x >= 3)    | 8 bits    | 32 bits     | 32 bits
64-bit machines   | 8 bits    | 32 bits     | 64 bits
                  |           |             |
general case (**) | 8 bits    | sizeof(int) | sizeof(uintptr_t)
(*) it was a special addressing mode where the 16-bit segment was shifted left by only 4 bits and added to the 16-bit offset to produce a 20-bit address - but far pointers used to be 32 bits long
(**) uintptr_t does not make much sense on the old architectures because the compilers (when they existed) did not support that type. But if a decent compiler were ported to them, I assume the values would be those.
But BEWARE: the types are defined by the compiler, not the architecture. That means that if you found an 8-bit compiler on a 64-bit machine, you would probably get sizeof(int) = 16 and sizeof(uintptr_t) = 16. So the above only makes sense if you use a compiler adapted to the architecture...
Because the C and C++ languages deliberately abstract away such considerations as the machine word size, it's unlikely that any method will be 100% reliable. However, there are the various int_fastXX_t types that may help you infer the size. For example, this simple C++ program:
#include <iostream>
#include <cstdint>

#define SHOW(x) std::cout << #x " = " << x << '\n'

int main()
{
    SHOW(sizeof(int_fast8_t));
    SHOW(sizeof(int_fast16_t));
    SHOW(sizeof(int_fast32_t));
    SHOW(sizeof(int_fast64_t));
}
produces this result using gcc version 5.3.1 on my 64-bit Linux machine:
sizeof(int_fast8_t) = 1
sizeof(int_fast16_t) = 8
sizeof(int_fast32_t) = 8
sizeof(int_fast64_t) = 8
This suggests that one means to discover the register size might be to look for the largest difference between a required size (e.g. 2 bytes for a 16-bit value) and the corresponding int_fastXX_t size, and to use the size of that int_fastXX_t as the register size.
Further results
Windows 7, gcc 4.9.3 under Cygwin on 64-bit machine: same as above
Windows 7, Visual Studio 2013 (v 12.0) on 64-bit machine:
sizeof(int_fast8_t) = 1
sizeof(int_fast16_t) = 4
sizeof(int_fast32_t) = 4
sizeof(int_fast64_t) = 8
Linux, gcc 4.6.3 on 32-bit ARM and also Linux, gcc 5.3.1 on 32-bit Atom:
sizeof(int_fast8_t) = 1
sizeof(int_fast16_t) = 4
sizeof(int_fast32_t) = 4
sizeof(int_fast64_t) = 8