Why are hexadecimal numbers prefixed with 0x?

Short story: The 0 tells the parser it's dealing with a constant (and not an identifier/reserved word). Something is still needed to specify the number base: the x is an arbitrary choice.

Long story: In the 60's, the prevalent programming number systems were decimal and octal — mainframes had 12, 24 or 36 bits per byte, which is nicely divisible by 3 = log2(8).

The BCPL language used the syntax 8 1234 for octal numbers. When Ken Thompson created B from BCPL, he used the 0 prefix instead. This is great because

  1. an integer constant now always consists of a single token,
  2. the parser can still tell right away it's got a constant,
  3. the parser can immediately tell the base (0 is the same in both bases),
  4. it's mathematically sane (00005 == 05), and
  5. no precious special characters are needed (as in #123).

When C was created from B, the need for hexadecimal numbers arose (the PDP-11 had 16-bit words) and all of the points above were still valid. Since octals were still needed for other machines, 0x was arbitrarily chosen (00 was probably ruled out as awkward).

C# is a descendant of C, so it inherits the syntax.


It's a prefix to indicate the number is in hexadecimal rather than in some other base. The programming language uses it to tell compiler.

Example:

0x6400 translates to 6*16^3 + 4*16^2 + 0*16^1 +0*16^0 = 25600. When compiler reads 0x6400, It understands the number is hexadecimal with the help of 0x term. Usually we can understand by (6400)16 or (6400)8 or whatever ..

For binary it would be:

0b00000001

Good day!


Note: I don't know the correct answer, but the below is just my personal speculation!

As has been mentioned a 0 before a number means it's octal:

04524 // octal, leading 0

Imagine needing to come up with a system to denote hexadecimal numbers, and note we're working in a C style environment. How about ending with h like assembly? Unfortunately you can't - it would allow you to make tokens which are valid identifiers (eg. you could name a variable the same thing) which would make for some nasty ambiguities.

8000h // hex
FF00h // oops - valid identifier!  Hex or a variable or type named FF00h?

You can't lead with a character for the same reason:

xFF00 // also valid identifier

Using a hash was probably thrown out because it conflicts with the preprocessor:

#define ...
#FF00 // invalid preprocessor token?

In the end, for whatever reason, they decided to put an x after a leading 0 to denote hexadecimal. It is unambiguous since it still starts with a number character so can't be a valid identifier, and is probably based off the octal convention of a leading 0.

0xFF00 // definitely not an identifier!

Tags:

C

Hex

Syntax