Why is base128 not used?

Because some of those 128 characters are unprintable, mainly the control characters below code point 0x20. They can't reliably be transmitted as a string over the wire. And if you go beyond code point 127, you run into encoding issues, because different systems interpret the upper range differently.
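To make that concrete, here is a minimal Java sketch (the class name `PrintableAsciiCount` is just illustrative) that counts how many of the 128 ASCII code points are visible, non-whitespace characters:

```java
// Count the ASCII code points that are visible, non-whitespace
// characters: 0x21 ('!') through 0x7E ('~'). Everything below 0x21
// is a control character or the space, and 0x7F is DEL.
public class PrintableAsciiCount {
    public static void main(String[] args) {
        int visible = 0;
        for (int cp = 0; cp < 128; cp++) {
            if (cp >= 0x21 && cp <= 0x7E) {
                visible++;
            }
        }
        System.out.println("Visible ASCII characters: " + visible);
    }
}
```

It prints 94, so a full 128-character alphabet of safe ASCII characters simply doesn't exist; 64 is the largest power of two that fits.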


As already stated in the other answers, the key point is to reduce the character set to the printable ones. A more efficient encoding scheme is basE91, because it uses a larger character set while still avoiding the control and whitespace characters in the low ASCII range. The basE91 webpage contains a nice comparison of binary vs. base64 vs. basE91 encoding efficiency.
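To put numbers on that comparison, here is a small Java sketch (the class name `EncodingEfficiency` is illustrative) that computes the theoretical upper bound for each scheme: one output character can carry at most log2(N) bits, so the best-case payload ratio per output byte is log2(N)/8:

```java
// Theoretical best-case efficiency of a base-N text encoding,
// assuming one byte per output character: log2(N) / 8.
public class EncodingEfficiency {
    public static void main(String[] args) {
        for (int base : new int[] {64, 85, 91}) {
            double bitsPerChar = Math.log(base) / Math.log(2);
            System.out.printf("base%d: %.3f bits/char, %.1f%% efficiency%n",
                    base, bitsPerChar, bitsPerChar / 8.0 * 100);
        }
    }
}
```

This prints 75.0% for base64, 80.1% for base85, and 81.3% for base91. Base64 hits its bound exactly (3 bytes always become 4 characters, ignoring padding), while basE91's actual overhead varies slightly with the input data.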

I once cleaned up the Java implementation. If people are interested, I could push it to GitHub.

Update: It's now on GitHub.


The problem is that at least 32 characters of the ASCII character set are 'control characters', which may be interpreted by the receiving terminal. For example, there's the BEL (bell) character that makes the receiving terminal chime, and the SOH (Start of Heading) and EOT (End of Transmission) characters, which do exactly what their names imply. And don't forget CR and LF, which may have special meanings in how data structures are serialized/flattened into a stream.
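A short Java sketch of the point, using the standard `java.util.Base64` (available since Java 8): raw bytes that happen to be BEL, EOT, CR, and LF come out as purely printable characters, so nothing in the encoded string can trigger terminal behaviour (the class name `ControlCharDemo` is illustrative):

```java
import java.util.Base64;

// Raw bytes that are all control characters: BEL, EOT, CR, LF.
public class ControlCharDemo {
    public static void main(String[] args) {
        byte[] raw = {0x07, 0x04, 0x0D, 0x0A};
        String encoded = Base64.getEncoder().encodeToString(raw);
        System.out.println(encoded); // prints "BwQNCg==" -- all printable
        for (char c : encoded.toCharArray()) {
            assert c >= 0x21 && c <= 0x7E; // never a control character
        }
    }
}
```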

Adobe's Ascii85 (a Base85 encoding, used in PostScript and PDF) uses more of the ASCII character set, but AFAIK it's protected by patents.
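For reference, here is a minimal sketch of how one 4-byte group is encoded in Adobe-style Ascii85: the group is read as an unsigned 32-bit big-endian integer and written as five base-85 digits offset by '!' (code point 33). This omits the short-final-group handling and the all-zero 'z' shortcut, and the class name `Ascii85Group` is illustrative:

```java
import java.nio.charset.StandardCharsets;

// Encode one 4-byte group as five Ascii85 characters ('!' .. 'u').
public class Ascii85Group {
    static char[] encodeGroup(byte[] b) {
        long value = ((b[0] & 0xFFL) << 24) | ((b[1] & 0xFFL) << 16)
                   | ((b[2] & 0xFFL) << 8)  |  (b[3] & 0xFFL);
        char[] out = new char[5];
        for (int i = 4; i >= 0; i--) {      // least significant digit last
            out[i] = (char) ('!' + (value % 85));
            value /= 85;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] group = "Man ".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(encodeGroup(group))); // prints "9jqo^"
    }
}
```

Four bytes become five characters, so the overhead is 25% instead of Base64's 33%.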