Why does Char have an instance for Bounded?
Character encodings are tricky. Behind the scenes, every character is represented by a number. The Unicode standard defines a set of "code points", which are simply numbers, each of which is either assigned to an abstract character or reserved. Unicode defines code points from 0 to 1114111 (0x10FFFF), and that upper limit is what you see when you evaluate maxBound :: Char.
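As a minimal sketch of what the Bounded instance gives you, the bounds can be read back as plain numbers with ord. The names lowestChar, highestChar, lowestCode and highestCode are made up for this illustration.

```haskell
import Data.Char (ord)

-- Smallest and largest Char values, taken from the Bounded instance.
-- (These names are illustrative only.)
lowestChar, highestChar :: Char
lowestChar  = minBound
highestChar = maxBound

-- The same bounds viewed as code point numbers.
lowestCode, highestCode :: Int
lowestCode  = ord lowestChar
highestCode = ord highestChar

main :: IO ()
main = do
  print lowestCode                 -- 0
  print highestCode                -- 1114111
  print (highestCode == 0x10FFFF)  -- True
```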
Char stores each Unicode code point as its own integer, which is somewhat inefficient for large amounts of text. If you want a more efficient representation, use Text.
You're seeing \1114111 displayed because that is the code point that maxBound :: Char represents, and there is no more meaningful way to display it. It sits at the very end of the "Supplementary Private Use Area-B" block, a range set aside for uses outside the scope of Unicode (U+10FFFF itself is designated a noncharacter), so it has no standard meaning.
All characters, like everything in a computer, are ultimately just numbers. Char represents Unicode characters, which are identified by numbers. You can convert between Char and Int values with ord and chr (both exported by Data.Char). For example, the Unicode value for a is 97, so ord 'a' is 97 and chr 97 is 'a'.
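A short sketch of that conversion, using only ord and chr from Data.Char. The names codeOfA, charOf97 and roundTrips are invented for the example.

```haskell
import Data.Char (chr, ord)

-- Char to number and back (illustrative names).
codeOfA :: Int
codeOfA = ord 'a'

charOf97 :: Char
charOf97 = chr 97

-- Converting a Char to its code point and back yields the same Char.
roundTrips :: Char -> Bool
roundTrips c = chr (ord c) == c

main :: IO ()
main = do
  print codeOfA                       -- 97
  print charOf97                      -- 'a'
  print (all roundTrips "caf\233 a")  -- True
```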
'\1114111' is the Char that represents the number 1114111, or 0x10FFFF, which Unicode defines as a noncharacter. This is the largest code point that Unicode defines, and the largest that Haskell supports: '\1114112' will cause a compile error.
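The compile error only applies to literals. For a number computed at run time, chr raises an error instead when the argument is out of range. One possible way to avoid that, sketched here with a hypothetical helper named safeChr (not a library function), is to check against the Bounded limits first:

```haskell
import Data.Char (chr, ord)

-- Hypothetical helper: convert an Int to a Char only if it lies within
-- the range given by the Bounded instance for Char.
safeChr :: Int -> Maybe Char
safeChr n
  | n >= ord (minBound :: Char) && n <= ord (maxBound :: Char) = Just (chr n)
  | otherwise = Nothing

main :: IO ()
main = do
  print (safeChr 97)       -- Just 'a'
  print (safeChr 1114111)  -- Just '\1114111'
  print (safeChr 1114112)  -- Nothing
  print (safeChr (-1))     -- Nothing
```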