Adding and subtracting chars, why does this work?

Chars are stored as integers (their ASCII values), so you can add and subtract them like any other integers, and the result is the ASCII value of a char.

From the Docs

The char data type is a single 16-bit Unicode character.

A char is represented by its code point value:

min: '\u0000' (or 0)
max: '\uffff' (or 65,535)

You can see all of the English alphabetic code points on an ASCII table.

Note that 0 == \u0000 and 65,535 == \uffff, and the same holds for every value in between. They are corresponding values.

A char is actually just stored as a number (its code point value). We have syntax to represent characters like char c = 'A';, but it's equivalent to char c = 65; and 'A' == 65 is true.
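To make the equivalence concrete, here is a minimal sketch. The class name CharAsNumber and the helpers codeOf and charOf are made up for illustration; they are not part of the question's code.

```java
public class CharAsNumber {
    // Hypothetical helper: a char widens to an int without a cast.
    static int codeOf(char c) {
        return c;
    }

    // Hypothetical helper: going from int back to char needs an explicit cast.
    static char charOf(int code) {
        return (char) code;
    }

    public static void main(String[] args) {
        char fromLiteral = 'A';
        char fromNumber = 65; // allowed: 65 is a constant that fits in a char

        System.out.println(fromLiteral == fromNumber); // prints true
        System.out.println(codeOf('A'));               // prints 65
        System.out.println(charOf(66));                // prints B
        System.out.println('A' + 1);                   // prints 66, because char + int is an int
    }
}
```

Note the last line: arithmetic on a char produces an int, so you need a cast like (char) ('A' + 1) if you want 'B' back.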

So in your code, the chars are being represented by their decimal values to do arithmetic (whole numbers from 0 to 65,535).

For example, the char 'A' is represented by its code point 65 (decimal value in ASCII table):

System.out.print('A'); // prints A
System.out.print((int) 'A'); // prints 65 because it is cast to an int

As a note, a short is a 16-bit signed integer, so even though a char is also 16 bits wide, the maximum integer value of a char (65,535) exceeds the maximum integer value of a short (32,767). Therefore, a cast from char to (short) does not always preserve the value. Also, the minimum integer value of a char is 0, whereas the minimum integer value of a short is -32,768.
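A small sketch of what that cast does at the extremes. The class name CharToShort and the helper toShort are invented for this example.

```java
public class CharToShort {
    // Hypothetical helper: narrows a char to a short, keeping the same 16 bits.
    static short toShort(char c) {
        return (short) c;
    }

    public static void main(String[] args) {
        System.out.println(toShort('A'));                 // prints 65, the value fits
        System.out.println((int) Character.MAX_VALUE);    // prints 65535
        System.out.println(toShort(Character.MAX_VALUE)); // prints -1, same bits read as signed
    }
}
```

The cast compiles and runs without error; it just reinterprets the 16 bits as a signed number, so any char above 32,767 comes out negative.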

For your code, let's say that the char was 'D'. Note that 'D' == 68 since its code point is 68.

return 10 + ch - 'A';

This returns 10 + 68 - 65, so it will return 13.

Now let's say the char was 'Q' == 81.

if (ch >= 'A' && ch <= 'F')

This is false since 'Q' > 'F' (81 > 70), so it would go into the else block and execute:

return ch - '0';

This returns 81 - 48 so it will return 33.

Your function returns an int, but if it instead returned a char, or if the int were cast to a char afterward, the value 33 would represent the '!' character, since 33 is its code point value. Look up the character in an ASCII or Unicode table to verify that '!' == 33 (compare the decimal values).
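Putting the pieces together, the walkthrough above can be checked with a sketch like the following. The question's full method is not shown here, so the method name hexCharToDecimal and the surrounding class are assumptions; only the if condition and the two return statements come from the snippets quoted above.

```java
public class HexDigit {
    // Assumed reconstruction built from the quoted lines; the real method may differ.
    static int hexCharToDecimal(char ch) {
        if (ch >= 'A' && ch <= 'F') {
            return 10 + ch - 'A'; // 'A'..'F' map to 10..15
        } else {
            return ch - '0';      // '0'..'9' map to 0..9; anything else is not validated
        }
    }

    public static void main(String[] args) {
        System.out.println(hexCharToDecimal('D'));        // prints 13 (10 + 68 - 65)
        System.out.println(hexCharToDecimal('7'));        // prints 7  (55 - 48)
        System.out.println(hexCharToDecimal('Q'));        // prints 33 (81 - 48)
        System.out.println((char) hexCharToDecimal('Q')); // prints !
    }
}
```

As the 'Q' case shows, this version does not reject characters that are not hex digits; it simply returns whatever the subtraction gives.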

This is because char is a primitive type which can be used as a numerical value. Every character in a string is encoded as a specific number (not entirely true in all cases, but good enough for a basic understanding of the matter) and Java allows you to use chars in such a way.

It probably allows this mostly for historical reasons: this is how it worked in C, and the designers probably justified it with "performance" or something like that.

If you think it's weird, don't worry; I think so too.

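The other answer is actually incorrect. ASCII is a specific encoding (an encoding is some specification that says "1 = A, 2 = B, ... , 255 = Space"), and it is not the one used in Java. A Java char is two bytes wide and is interpreted as Unicode (specifically, as a UTF-16 code unit). The first 128 Unicode code points match ASCII, which is why an ASCII table still gives the right numbers for English letters and digits.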

Regardless of how Java actually stores the char data type, one thing is certain: subtracting the character 'A' from the character 'A' gives a value whose bits are all 0, which as a char is the null character, \0. (In Java the subtraction itself produces the int 0, which has the same all-zero bit pattern.) The amount of memory a char takes up may vary from language to language, but as far as I know, the null character is the same in all of them: every bit is 0.

As an int value, a piece of memory with every bit equal to 0 represents the integer value of 0.

And as it turns out, when you do "character math", subtracting any alphabetical character from any other alphabetical character of the same case produces a value that, interpreted as an int, is the distance between those characters. This works because the letters of each case are assigned consecutive code points. Likewise, subtracting the char '0' from any digit char gives the int value of that digit, for basically the same reason.

'A' - 'A' = '\0'
'a' - 'a' = '\0'
'0' - '0' = '\0'
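Here is a short sketch of that "character math" in Java. The class name CharDistance and the helpers letterIndex and digitValue are hypothetical names chosen for this example.

```java
public class CharDistance {
    // Hypothetical helper: distance of a lowercase letter from 'a' (so 'a' gives 0).
    static int letterIndex(char lower) {
        return lower - 'a';
    }

    // Hypothetical helper: numeric value of a digit character ('0' gives 0).
    static int digitValue(char digit) {
        return digit - '0';
    }

    public static void main(String[] args) {
        System.out.println(letterIndex('a')); // prints 0
        System.out.println(letterIndex('e')); // prints 4
        System.out.println(digitValue('7'));  // prints 7

        // char - char is an int in Java; cast it to get the null character as a char.
        char nul = (char) ('A' - 'A');
        System.out.println(nul == '\0');      // prints true
    }
}
```

Both helpers assume the argument is in the expected range (a lowercase letter or a digit); other characters still produce a number, just not a meaningful one.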
