Adding and subtracting chars, why does this work?
Chars are stored as integers (their code point values, which match ASCII for English letters and digits), so you can add and subtract them like integers. The result is an integer that can be read back as the code point of a char.
From the Docs
The char data type is a single 16-bit Unicode character.
A char is represented by its code point value:
- min: '\u0000' (or 0)
- max: '\uffff' (or 65,535)
You can see all of the English alphabetic code points on an ASCII table.
Note that 0 == \u0000 and 65,535 == \uffff, as well as everything in between. They are corresponding values.
A char is actually just stored as a number (its code point value). We have syntax to represent characters, like char c = 'A';, but it's equivalent to char c = 65; and 'A' == 65 is true.
So in your code, the chars are being represented by their decimal values to do arithmetic (whole numbers from 0 to 65,535).
For example, the char 'A' is represented by its code point 65 (its decimal value in the ASCII table):
System.out.print('A'); // prints A
System.out.print((int)('A')); // prints 65 because you cast it to an int
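To make the equivalence concrete, here is a minimal sketch of char arithmetic. The class name CharAsNumber and the helper shift are made up for illustration; the point is only that adding to a char produces an int, which has to be cast back to get a char again.

```java
final class CharAsNumber {
    // Hypothetical helper: returns the character n positions after start.
    static char shift(char start, int n) {
        // start + n is promoted to int, so a cast back to char is needed.
        return (char) (start + n);
    }

    public static void main(String[] args) {
        char c = 65;                        // same as char c = 'A';
        System.out.println(c);              // prints A
        System.out.println('A' == 65);      // prints true
        System.out.println('A' + 1);        // prints 66 (the sum is an int)
        System.out.println(shift('A', 1));  // prints B
    }
}
```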
As a note, a short is a 16-bit signed integer, so even though a char is also 16 bits, the maximum integer value of a char (65,535) exceeds the maximum integer value of a short (32,767). Therefore, a cast to (short) from a char does not always preserve the value: any char above 32,767 comes out as a negative short. And the minimum integer value of a char is 0, whereas the minimum integer value of a short is -32,768.
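A small sketch of what that cast does in practice. The class and method names here are illustrative only; the cast itself always compiles and runs, it just reinterprets the same 16 bits as a signed number.

```java
final class CharToShort {
    // Narrowing cast: keeps the 16 bits but reads them as a signed value.
    static short toShort(char ch) {
        return (short) ch;
    }

    public static void main(String[] args) {
        System.out.println(toShort('A'));               // prints 65
        System.out.println((int) Character.MAX_VALUE);  // prints 65535
        System.out.println(Short.MAX_VALUE);            // prints 32767
        System.out.println(toShort(Character.MAX_VALUE)); // prints -1
    }
}
```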
For your code, let's say that the char was 'D'. Note that 'D' == 68 since its code point is 68.
return 10 + ch - 'A';
This returns 10 + 68 - 65, so it will return 13.
Now let's say the char was 'Q' (note that 'Q' == 81).
if (ch >= 'A' && ch <= 'F')
This is false since 'Q' > 'F' (81 > 70), so it would go into the else block and execute:
return ch - '0';
This returns 81 - 48, so it will return 33.
Your function returns an int, but if it were to return a char instead, or if the int were cast to a char afterward, then the returned value 33 would represent the '!' character, since 33 is its code point value. Look up the character in an ASCII table or Unicode table to verify that '!' == 33 (compare decimal values).
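Putting the pieces together, the walkthrough can be checked with a sketch like the following. The full function from the question is not shown here, so this is a reconstruction from the if condition and the two return statements quoted above; the names HexDigit and hexCharToDecimal are assumptions.

```java
final class HexDigit {
    // Hypothetical reconstruction of the function discussed above.
    static int hexCharToDecimal(char ch) {
        if (ch >= 'A' && ch <= 'F') {
            return 10 + ch - 'A';   // 'A'..'F' map to 10..15
        } else {
            return ch - '0';        // '0'..'9' map to 0..9
        }
    }

    public static void main(String[] args) {
        System.out.println(hexCharToDecimal('D'));        // prints 13
        System.out.println(hexCharToDecimal('7'));        // prints 7
        System.out.println(hexCharToDecimal('Q'));        // prints 33 (not a valid hex digit)
        System.out.println((char) hexCharToDecimal('Q')); // prints !
    }
}
```

Note that the sketch does no validation, which is why 'Q' falls through to the else branch and yields 33 rather than an error.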
This is because char is a primitive type which can be used as a numerical value. Every character in a string is encoded as a specific number (not entirely true in all cases, but good enough for a basic understanding of the matter) and Java allows you to use chars in such a way.
It probably allows this mostly for historical reasons: this is how it worked in C, and the designers probably motivated it with "performance" or something like that.
If you think it's weird then don't worry, I think so too.
The other answer is incorrect, actually. ASCII is a specific encoding (an encoding is a specification that says something like "1 = A, 2 = B, ... , 255 = Space"), and it is not the one used in Java. A Java char is two bytes wide and holds a UTF-16 code unit of Unicode. The first 128 Unicode code points match ASCII, which is why an ASCII table still gives the right values for English letters and digits.
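A quick way to see the difference is to look at a char outside the ASCII range. This is only an illustrative sketch; the class name BeyondAscii and the helper isAscii are made up for the example, and the euro sign is just one convenient non-ASCII character.

```java
final class BeyondAscii {
    // True if the char falls in the 7-bit ASCII range (0 to 127).
    static boolean isAscii(char ch) {
        return ch <= 127;
    }

    public static void main(String[] args) {
        char euro = '\u20ac';               // the euro sign
        System.out.println((int) 'A');      // prints 65, same as in ASCII
        System.out.println((int) euro);     // prints 8364, outside ASCII
        System.out.println(isAscii('A'));   // prints true
        System.out.println(isAscii(euro));  // prints false
    }
}
```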
Regardless of how Java actually stores the char datatype, what's certain is this: subtracting the character 'A' from the character 'A' gives 0 (an int in Java), and cast back to a char that is the null character, \0. In memory, this means every bit is 0. The size a char takes up in memory may vary from language to language, but as far as I know, the null character is the same in all of them: every bit is equal to 0.
As an int value, a piece of memory with every bit equal to 0 represents the integer value of 0.
And as it turns out, when you do "character math", subtracting any alphabetical character from any other alphabetical character (of the same case) gives a result that, interpreted as an int, is the distance between these characters. This works because the letters of each case sit at consecutive code points. Additionally, subtracting the char '0' from any other numeric char will result in the int value of the digit you subtracted from, for basically the same reason: the digits '0' through '9' are consecutive too.
'A' - 'A' == 0   // and (char) 0 is '\0'
'a' - 'a' == 0   // and (char) 0 is '\0'
'0' - '0' == 0   // and (char) 0 is '\0'
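For what it's worth, the distance idea can be tried out with a short sketch like this one. The class and method names are invented for the example; in Java the subtraction itself yields an int, so a cast is only needed to get back to a char.

```java
final class CharDistance {
    // Number of code points between two chars, as an int.
    static int distance(char from, char to) {
        return to - from;
    }

    // Numeric value of a digit char, assuming it is '0' through '9'.
    static int digitValue(char digit) {
        return digit - '0';
    }

    public static void main(String[] args) {
        System.out.println(distance('A', 'A'));          // prints 0
        System.out.println(distance('A', 'D'));          // prints 3
        System.out.println(distance('a', 'z'));          // prints 25
        System.out.println(digitValue('7'));             // prints 7
        System.out.println((char) ('A' - 'A') == '\0');  // prints true
    }
}
```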