How to add a UTF-8 BOM in Java?

PrintStream#print

I think that out.write('\ufeff'); should actually be out.print('\ufeff');, calling the java.io.PrintStream#print method.

According to the Javadoc, the write(int) method actually writes a byte ... without any character encoding. So out.write('\ufeff'); writes the byte 0xff. By contrast, the print(char) method encodes the character as one or more bytes using the stream's encoding, and then writes those bytes.

As noted in section 23.8 of the Unicode 9 specification, the BOM for UTF-8 is EF BB BF. That sequence is what you get when you encode '\ufeff' as UTF-8. See: "Why UTF-8 BOM bytes efbbbf can be replaced by \ufeff?"
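To make the difference concrete, here is a minimal sketch that captures what each call emits. The class and method names are made up for illustration, and it assumes the PrintStream is created with UTF-8 encoding (print only yields EF BB BF in that case).

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class BomPrintVsWrite {
    // Bytes produced by PrintStream#write(int): only the low-order byte is kept.
    public static byte[] viaWrite() throws UnsupportedEncodingException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, true, "UTF-8");
        out.write('\ufeff');
        out.flush();
        return sink.toByteArray();
    }

    // Bytes produced by PrintStream#print(char): the char is encoded with the stream's charset.
    public static byte[] viaPrint() throws UnsupportedEncodingException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, true, "UTF-8");
        out.print('\ufeff');
        out.flush();
        return sink.toByteArray();
    }

    // Formats bytes as space-separated upper-case hex, e.g. "EF BB BF".
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("write: " + hex(viaWrite())); // prints "write: FF"
        System.out.println("print: " + hex(viaPrint())); // prints "print: EF BB BF"
    }
}
```

If the PrintStream is created without an explicit charset, print uses the platform default encoding, so the output may not be EF BB BF.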


    BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(...), StandardCharsets.UTF_8));
    out.write('\ufeff');
    out.write(...);

This correctly writes out 0xEF 0xBB 0xBF to the file, which is the UTF-8 representation of the BOM.
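For anyone who wants to verify this, a runnable sketch of the same Writer approach follows. The temp file, the sample CSV text and the names BomWriterDemo and writeWithBom are assumptions for the demo, not part of the original snippet.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BomWriterDemo {
    // Writes the BOM character first, then the text; the Writer encodes both as UTF-8.
    public static void writeWithBom(Path target, String text) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(target.toFile()), StandardCharsets.UTF_8))) {
            out.write('\ufeff');
            out.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("bom-demo", ".csv"); // hypothetical file for the demo
        try {
            writeWithBom(tmp, "id,name\n");
            byte[] bytes = Files.readAllBytes(tmp);
            // prints "EF BB BF"
            System.out.printf("%02X %02X %02X%n", bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```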


Just in case people are using a PrintStream and its write(int) method, you need to do it a little differently. A Writer encodes the single character '\ufeff' into the 3 BOM bytes for you, but PrintStream#write(int) writes one raw byte per call, so all 3 bytes of the UTF-8 BOM have to be written individually:

    // Print the UTF-8 BOM; write(int) keeps only the low-order byte of its argument
    PrintStream out = System.out;
    out.write('\ufeef'); // emits 0xef
    out.write('\ufebb'); // emits 0xbb
    out.write('\ufebf'); // emits 0xbf

Alternatively, you can use the hex values for those directly:

    PrintStream out = System.out;
    out.write(0xef); // emits 0xef
    out.write(0xbb); // emits 0xbb
    out.write(0xbf); // emits 0xbf

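To write a UTF-8 BOM with a single call on a PrintStream, you need PrintStream.print('\ufeff'), not PrintStream.write('\ufeff'), and the stream has to be using UTF-8 as its encoding.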

Also, if you want to have a BOM in your CSV file, I guess you need to print the BOM after putNextEntry().
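A rough sketch of that ordering, assuming the putNextEntry() in question is java.util.zip.ZipOutputStream#putNextEntry and that the CSV is written through a UTF-8 Writer. The class name, method name and entry name are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipCsvBomDemo {
    // Builds an in-memory zip with one CSV entry whose content starts with the UTF-8 BOM.
    public static byte[] zipCsvWithBom(String entryName, String csv) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(sink)) {
            // Open the entry first; bytes written before this call would not belong to it.
            zip.putNextEntry(new ZipEntry(entryName));
            Writer out = new OutputStreamWriter(zip, StandardCharsets.UTF_8);
            out.write('\ufeff'); // encoded as EF BB BF at the start of the entry
            out.write(csv);
            out.flush();         // flush rather than close, so the zip stream stays open
            zip.closeEntry();
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zipped = zipCsvWithBom("data.csv", "id,name\n");
        System.out.println("zip size in bytes: " + zipped.length);
    }
}
```

With several entries, the BOM would presumably have to be written once per entry, right after each putNextEntry() call.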