RESTSharp has problems deserializing XML including Byte Order Mark?

I found the solution. Thank you @arootbeer for the hints!

Instead of wrapping the XmlDeserializer, you can also use RestSharp's 'RestRequest.OnBeforeDeserialization' callback. Just insert something like this after the new RestRequest() (see my initial code example) and it works perfectly:

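request.OnBeforeDeserialization = resp =>
{
    // Remove a leading byte order mark (once decoded, it is the single character U+FEFF).
    // See: http://stackoverflow.com/questions/19663100/restsharp-has-problems-deserializing-xml-including-byte-order-mark
    string byteOrderMarkUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble());
    // Compare ordinally: a culture-sensitive StartsWith can treat U+FEFF as ignorable
    // and report a match even when no BOM is present.
    if (resp.Content.StartsWith(byteOrderMarkUtf8, StringComparison.Ordinal))
        resp.Content = resp.Content.Remove(0, byteOrderMarkUtf8.Length);
};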

I had this same problem, but not specifically with RestSharp. Use this:

var responseXml = new UTF8Encoding(false).GetString(bytes);

Original discussion: XmlReader breaks on UTF-8 BOM

Pertinent quote from the answer:

The xml string must not (!) contain the BOM, the BOM is only allowed in byte data (e.g. streams) which is encoded with UTF-8. This is because the string representation is not encoded, but already a sequence of unicode characters.
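
To make the quote concrete, here is a minimal sketch of decoding raw bytes so that the resulting string never starts with a BOM. As far as I can tell, GetString decodes whatever bytes it is given, so a leading BOM would still come through as U+FEFF unless it is skipped explicitly. The class and method names (BomSafeDecoding, DecodeUtf8WithoutBom) are made up for illustration and are not part of RestSharp or .NET.

```csharp
using System;
using System.Text;

public static class BomSafeDecoding
{
    // Hypothetical helper: decodes UTF-8 bytes, skipping a leading BOM if present.
    public static string DecodeUtf8WithoutBom(byte[] bytes)
    {
        byte[] bom = Encoding.UTF8.GetPreamble(); // EF BB BF

        // The BOM is only meaningful in the byte data, so check for it there.
        bool hasBom = bytes.Length >= bom.Length
            && bytes[0] == bom[0]
            && bytes[1] == bom[1]
            && bytes[2] == bom[2];

        // Start decoding after the BOM so the string holds only the document text.
        int offset = hasBom ? bom.Length : 0;
        return Encoding.UTF8.GetString(bytes, offset, bytes.Length - offset);
    }
}
```

Usage would look like var responseXml = BomSafeDecoding.DecodeUtf8WithoutBom(bytes); and the result can then be handed to an XML reader or deserializer.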

Edit: Looking through their docs, it looks like the most straightforward way to handle this (aside from a GitHub issue) is to call the non-generic Execute() method and deserialize the response from that string. You could also create an IDeserializer that wraps the default XML deserializer.
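
A rough sketch of the "deserialize it yourself from the string" route, under some assumptions: it uses the framework's System.Xml.Serialization.XmlSerializer rather than RestSharp's own deserializer, and the Item type and ManualXml helper are hypothetical names used only for illustration. The content argument stands in for the response string you get back from the non-generic Execute().

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical payload type, for illustration only.
public class Item
{
    public string Name { get; set; }
}

public static class ManualXml
{
    // Hypothetical helper: deserializes an XML string after dropping a leading BOM.
    public static T Deserialize<T>(string content)
    {
        // In a string, a decoded UTF-8 BOM is the single character U+FEFF.
        string xml = content.TrimStart('\uFEFF');

        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

With that in place, something like ManualXml.Deserialize<Item>(response.Content) would replace the generic Execute<T>() call. An IDeserializer wrapper would apply the same trimming before delegating to the default XML deserializer.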

The solution that @dataCore posted doesn't quite work, but this one should.

request.OnBeforeDeserialization = resp => {
    if (resp.RawBytes.Length >= 3 && resp.RawBytes[0] == 0xEF && resp.RawBytes[1] == 0xBB && resp.RawBytes[2] == 0xBF)
    {
        // Copy the data but with the UTF-8 BOM removed.
        var newData = new byte[resp.RawBytes.Length - 3];
        Buffer.BlockCopy(resp.RawBytes, 3, newData, 0, newData.Length);
        resp.RawBytes = newData;

        // Force re-conversion to string on next access
        resp.Content = null;
    }
};

Setting resp.Content to null is there as a safety guard, as RawBytes is only converted to a string if Content isn't already set to a value.


