Print chess symbols using UnicodeBlock?
Some chess symbols exist in the Miscellaneous Symbols block, but you are checking 16-bit char values against a different block. The Chess Symbols block contains no characters that fit in 16 bits; it starts at U+1FA00 and ends at U+1FA6F.
By casting to char, you are trimming all values above U+FFFF to their lowest 16 bits; for example, if i is 0x1fa60, casting it to a char will make it 0xfa60, which prevents your block check from succeeding.
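The truncation can be seen in isolation with a small sketch. The class name CastTruncationDemo and the helper truncate are made up for illustration; only the cast itself comes from the code in the question.

```java
class CastTruncationDemo {
    // Hypothetical helper: narrowing an int to char keeps only the low 16 bits.
    static int truncate(int codePoint) {
        return (char) codePoint;
    }

    public static void main(String[] args) {
        int codePoint = 0x1FA60; // a code point inside the Chess Symbols block
        int truncated = truncate(codePoint);

        System.out.printf("original:  U+%04X%n", codePoint);
        System.out.printf("truncated: U+%04X%n", truncated);

        // The truncated value belongs to an unrelated block, so a check
        // against the chess block can never match it.
        System.out.println(Character.UnicodeBlock.of(truncated));
    }
}
```

U+FA60 lies in the CJK Compatibility Ideographs block, which is why the comparison against the chess block fails for every value the loop produces.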
To make your code work, you need to stop assuming that all code points are 16-bit values. You can do that by changing this:
char unicode = (char) i;
to this:
int unicode = i;
Unfortunately, Character.UnicodeBlock has no methods that report the first and last code points of a block. In Unicode 11 the Chess Symbols block spans U+1FA00 to U+1FA6F, with U+1FA6D the last code point assigned in that version.
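Since the class offers no range accessors, one workaround is to scan every code point and record the first and last one that map to the block. This is only a sketch: BlockRangeDemo and rangeOf are hypothetical names, and Character.UnicodeBlock.CHESS_SYMBOLS is assumed to be available, which requires a JDK that ships Unicode 11 data or later.

```java
class BlockRangeDemo {
    // Hypothetical helper: returns {first, last} code points that
    // Character.UnicodeBlock.of() maps to the block, or null if none do.
    static int[] rangeOf(Character.UnicodeBlock block) {
        int first = -1;
        int last = -1;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            // of() returns null for code points outside every block,
            // so compare with == instead of calling equals() on the result.
            if (Character.UnicodeBlock.of(cp) == block) {
                if (first < 0) {
                    first = cp;
                }
                last = cp;
            }
        }
        return first < 0 ? null : new int[] {first, last};
    }

    public static void main(String[] args) {
        // Assumes a JDK that defines CHESS_SYMBOLS.
        int[] range = rangeOf(Character.UnicodeBlock.CHESS_SYMBOLS);
        System.out.printf("U+%04X..U+%04X%n", range[0], range[1]);
    }
}
```

The scan reports the block boundaries as the JDK sees them, which includes unassigned code points inside the block.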
Java uses UTF-16, which represents code points above U+FFFF as surrogate pairs. For example, code point U+1FA60 is represented as two char values: U+D83E (high surrogate) and U+DE60 (low surrogate).
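To check how a given code point splits into UTF-16 code units, a short sketch like the following can help. SurrogatePairDemo and describe are illustrative names; Character.toChars, Character.highSurrogate, and Character.lowSurrogate are standard methods.

```java
class SurrogatePairDemo {
    // Hypothetical helper: formats the UTF-16 code units of a code point
    // as space-separated four-digit hex values.
    static String describe(int codePoint) {
        char[] units = Character.toChars(codePoint); // one or two chars
        StringBuilder sb = new StringBuilder();
        for (char unit : units) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(0x2654));  // BMP code point: a single char
        System.out.println(describe(0x1FA00)); // supplementary: surrogate pair
        System.out.println(describe(0x1FA60)); // supplementary: surrogate pair
    }
}
```

For U+1FA60 this yields D83E DE60, and for U+1FA00 it yields D83E DE00; a code point in the Basic Multilingual Plane such as U+2654 stays a single char.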
You should use Character.toChars() to correctly print the code point, which is always an int:
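// CHESS_SYMBOLS exists only in JDKs that ship Unicode 11 data or later.
Character.UnicodeBlock block = Character.UnicodeBlock.CHESS_SYMBOLS;
for (int i = Character.MIN_CODE_POINT; i <= Character.MAX_CODE_POINT; i++) {
    // of() returns null for code points outside every block,
    // so call equals() on block to avoid a NullPointerException.
    if (block.equals(Character.UnicodeBlock.of(i))) {
        // toChars() yields a surrogate pair for supplementary code points.
        System.out.println(Character.toChars(i));
    }
}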