SQL Server: Replace invalid XML characters from a VARCHAR(MAX) field
There is a trick using the implicit conversion of VARBINARY
to base64 and back:
Here your list of evil
DECLARE @evilChars VARCHAR(MAX)=
CHAR(0x0)
+ CHAR(0x1)
+ CHAR(0x2)
+ CHAR(0x3)
+ CHAR(0x4)
+ CHAR(0x5)
+ CHAR(0x6)
+ CHAR(0x7)
+ CHAR(0x8)
+ CHAR(0x9)
+ CHAR(0xa)
+ CHAR(0xb)
+ CHAR(0xc)
+ CHAR(0xd)
+ CHAR(0xe)
+ CHAR(0xf)
+ CHAR(0x10)
+ CHAR(0x11)
+ CHAR(0x12)
+ CHAR(0x13)
+ CHAR(0x14)
+ CHAR(0x15)
+ CHAR(0x16)
+ CHAR(0x17)
+ CHAR(0x18)
+ CHAR(0x19)
+ CHAR(0x1a)
+ CHAR(0x1b)
+ CHAR(0x1c)
+ CHAR(0x1d)
+ CHAR(0x1e)
+ CHAR(0x1f)
+ CHAR(0x7f);
This works
DECLARE @XmlAsString NVARCHAR(MAX)=
(
SELECT @evilChars FOR XML PATH('test')
);
SELECT @XmlAsString;
The result (some are "printed")
<test>�

</test>
The following is forbidden
SELECT CAST(@XmlAsString AS XML)
But you can use the implicit conversion of VARBINARY to base64
DECLARE @base64 NVARCHAR(MAX)=
(
SELECT CAST(@evilChars AS VARBINARY(MAX)) FOR XML PATH('test')
);
SELECT @base64;
The result
<test>AAECAwQFBgcICQoLDA0ODxAREhMUFRYXGBkaGxwdHh9/</test>
Now you've got your real XML including the special characters!
SELECT CAST(CAST(@base64 AS XML).value('/test[1]','varbinary(max)') AS VARCHAR(MAX)) FOR XML PATH('reconverted')
The result
<reconverted>�

</reconverted>