What is character 0x1f?
US stands for "Unit Separator". It is an invisible control character, so open your text file in an editor that can display invisible characters and remove them there. I think Notepad++ can do this:
http://notepad-plus-plus.org/
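If no such editor is at hand, a short script can report where the characters are. The following is only a sketch: the function name `find_unit_separators` and the sample bytes are made up for illustration, and the columns it reports are byte offsets within each line, not character positions.

```python
def find_unit_separators(raw: bytes) -> list[tuple[int, int]]:
    """Return 1-based (line, byte column) pairs for every 0x1f byte."""
    hits = []
    for line_no, line in enumerate(raw.split(b"\n"), start=1):
        # Iterating over bytes yields integers, so compare against 0x1F.
        for col, byte in enumerate(line, start=1):
            if byte == 0x1F:
                hits.append((line_no, col))
    return hits


# Hypothetical sample input; in practice read the file with open(path, "rb").
sample = b"<a>one\x1ftwo</a>\n<b>three</b>\n<c>\x1f</c>"
print(find_unit_separators(sample))  # [(1, 7), (3, 4)]
```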
0x1f is a Unit Separator, an archaic way to separate fields in a text (like , or Tab in CSV).
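It is indeed not a valid character in XML 1.0. XML 1.1 allows it, but only as a character reference such as &#x1F; and not as a literal character. In UTF-8 input you can safely replace the byte 0x1f with 0x09 (Tab) to work around the problem, because in UTF-8 a byte below 0x80 never occurs as part of a multi-byte sequence. Alternatively, declare the document as XML 1.1, write the character as a character reference, and use an XML 1.1 parser.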
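The byte replacement could look roughly like this in Python. This is a minimal sketch, not a complete solution: the helper name `replace_unit_separators` and the sample document are invented for illustration, and it assumes the input is UTF-8 (or plain ASCII) and that a Tab is an acceptable stand-in for the separator in your data.

```python
import xml.etree.ElementTree as ET


def replace_unit_separators(raw: bytes) -> bytes:
    """Swap every 0x1f byte for 0x09 (Tab); assumes UTF-8 input."""
    return raw.replace(b"\x1f", b"\x09")


# Hypothetical sample document containing a literal Unit Separator.
raw = b"<row>alpha\x1fbeta</row>"

try:
    ET.fromstring(raw)
except ET.ParseError as exc:
    # An XML 1.0 parser is expected to reject the raw control character.
    print("raw input rejected:", exc)

cleaned = replace_unit_separators(raw)
root = ET.fromstring(cleaned)
print(root.text.split("\t"))
```

Working on bytes before decoding keeps the fix independent of how the file is later parsed. If the separator carries meaning in your data, choose a replacement that cannot collide with Tabs already present.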