Is Java 8 java.util.Base64 a drop-in replacement for sun.misc.BASE64?
I had the same issue when I moved from sun to java.util.Base64, but org.apache.commons.codec.binary.Base64 solved my problem.
Here's a small test program that illustrates a difference in the encoded strings:
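import java.nio.charset.StandardCharsets;

// 57 input bytes encode to exactly 76 Base64 characters, i.e. one full MIME line.
byte[] bytes = new byte[57];
String enc1 = new sun.misc.BASE64Encoder().encode(bytes);
String enc2 = new String(java.util.Base64.getMimeEncoder().encode(bytes),
                         StandardCharsets.UTF_8);
// Angle brackets make any trailing whitespace visible in the output.
System.out.println("enc1 = <" + enc1 + ">");
System.out.println("enc2 = <" + enc2 + ">");
System.out.println(enc1.equals(enc2));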
Its output is:
enc1 = <AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>
enc2 = <AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA>
false
Note that the encoded output of sun.misc.BASE64Encoder has a newline at the end. It doesn't always append a newline, but it happens to do so if the encoded string has exactly 76 characters on its last line. (The author of java.util.Base64 considered this to be a small bug in the sun.misc.BASE64Encoder implementation; see the review thread.)
This might seem like a triviality, but if you had a program that relied on this specific behavior, switching encoders might result in malformed output. Therefore, I conclude that java.util.Base64 is not a drop-in replacement for sun.misc.BASE64Encoder.
Of course, java.util.Base64 is intended to be a functionally equivalent, RFC-conformant, high-performance, fully supported and specified replacement that supports migrating code away from sun.misc.BASE64Encoder. You need to be aware of edge cases like this when migrating, though.
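One way to check whether this edge case matters for your data is to test how your decoder treats the extra newline. The following is a minimal sketch, not a definitive migration recipe. It only uses java.util.Base64, because sun.misc.BASE64Encoder is not available on recent JDKs. The class name and the helper withLegacyTrailingNewline are hypothetical, and the helper assumes the legacy separator was "\n" (the old encoder may have used the platform line separator instead).

```java
import java.util.Arrays;
import java.util.Base64;

class TrailingNewlineCheck {
    // Encodes with the MIME encoder: 76-character lines, no separator after the last line.
    static String mimeEncode(byte[] data) {
        return Base64.getMimeEncoder().encodeToString(data);
    }

    // Hypothetical stand-in for legacy output: appends "\n" (an assumption, see lead-in).
    static String withLegacyTrailingNewline(String encoded) {
        return encoded + "\n";
    }

    // Returns true if the strict (basic) decoder accepts the input unchanged.
    static boolean basicDecoderAccepts(String encoded) {
        try {
            Base64.getDecoder().decode(encoded);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // the basic decoder rejects characters outside the alphabet
        }
    }

    public static void main(String[] args) {
        byte[] bytes = new byte[57];
        String modern = mimeEncode(bytes);
        String legacyLike = withLegacyTrailingNewline(modern);

        // The MIME decoder ignores line separators, so both forms round-trip.
        byte[] decoded = Base64.getMimeDecoder().decode(legacyLike);
        System.out.println("MIME decoder round-trips: " + Arrays.equals(bytes, decoded));
        System.out.println("basic decoder accepts:    " + basicDecoderAccepts(legacyLike));
    }
}
```

If the consumer of your encoded strings is lenient in the same way as the MIME decoder, the trailing newline is harmless. If it compares strings byte for byte, or uses a strict decoder, you need to normalize the output yourself.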
There are no changes to the base64 specification between RFC 1521 and RFC 2045.
All base64 implementations could be considered drop-in replacements for one another. The only differences between base64 implementations are:
- the alphabet used.
- the APIs provided (e.g. some might only act on a full input buffer, while others might be finite state machines allowing you to continue to push chunks of input through them until you are done).
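To illustrate the API difference with java.util.Base64 itself, here is a small sketch that encodes the same bytes once from a full buffer and once by pushing chunks through a wrapping stream. The class and method names are made up for this example; only Base64.getEncoder(), encodeToString and wrap come from the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class OneShotVsStreaming {
    // One-shot style: the whole input must be in memory.
    static String encodeOneShot(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Streaming style: input is pushed through in chunks of any size.
    static String encodeInChunks(byte[] data, int chunkSize) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = Base64.getEncoder().wrap(sink)) {
            for (int off = 0; off < data.length; off += chunkSize) {
                int len = Math.min(chunkSize, data.length - off);
                out.write(data, off, len);
            }
        } // closing the wrapper writes any leftover bits and the padding
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello world".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encodeOneShot(data));
        System.out.println(encodeInChunks(data, 4));
    }
}
```

Both calls should print the same string, which is the point: the API shape differs, the encoding does not.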
The MIME base64 alphabet has remained constant between RFC versions (it has to, or older software would break) and is: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
As Wikipedia notes, only the last 2 characters may change between base64 implementations.
As an example of a base64 implementation that does change one of the last 2 characters, the IMAP MUTF-7 specification uses the following base64 alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+,
The reason for the change is that the / character is often used as a path delimiter, and since the MUTF-7 encoding is used to flatten non-ASCII directory paths into ASCII, the / character needed to be avoided in encoded segments.
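The alphabet difference is easy to see with input whose encoding hits the last two alphabet positions. The sketch below compares the standard encoder with the JDK's URL-safe encoder, which swaps + and / for - and _. The commaVariant helper is hypothetical and only swaps / for , to mimic the alphabet above; it is not a MUTF-7 encoder, since real MUTF-7 has further rules (for example, it omits the = padding).

```java
import java.util.Base64;

class AlphabetVariants {
    // Standard alphabet: positions 62 and 63 are '+' and '/'.
    static String standard(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // URL-safe alphabet from the JDK: positions 62 and 63 are '-' and '_'.
    static String urlSafe(byte[] data) {
        return Base64.getUrlEncoder().encodeToString(data);
    }

    // Hypothetical illustration only: replaces '/' with ',' in standard output.
    static String commaVariant(byte[] data) {
        return standard(data).replace('/', ',');
    }

    public static void main(String[] args) {
        // These two bytes map to alphabet indices 62, 63 and 60.
        byte[] data = {(byte) 0xfb, (byte) 0xff};
        System.out.println(standard(data));     // +/8=
        System.out.println(urlSafe(data));      // -_8=
        System.out.println(commaVariant(data)); // +,8=
    }
}
```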