Is there a way to escape a CDATA end token in XML?
You have to break your data into pieces to conceal the ]]>.
Here's the whole thing:
<![CDATA[]]]]><![CDATA[>]]>
The first <![CDATA[]]]]> has the ]]. The second <![CDATA[>]]> has the >.
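The splitting trick above can be automated. Here is a minimal sketch in Python; the helper name `wrap_cdata` is hypothetical and not part of any library, and it assumes the input contains only characters that are legal in XML.

```python
CDATA_START = "<![CDATA["
CDATA_END = "]]>"


def wrap_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting every ']]>' across two adjacent sections."""
    # "]]" stays in the current section, which is then closed; a new section
    # is opened for ">" so the three characters never sit in one section.
    safe = text.replace(CDATA_END, "]]" + CDATA_END + CDATA_START + ">")
    return CDATA_START + safe + CDATA_END


print(wrap_cdata("]]>"))    # <![CDATA[]]]]><![CDATA[>]]>
print(wrap_cdata("plain"))  # <![CDATA[plain]]>
```

For the input `]]>` this produces exactly the two-section form shown above.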
Clearly, this question is purely academic. Fortunately, it has a very definite answer.
You cannot escape a CDATA end sequence. Production rule 20 of the XML specification is quite clear:
[20] CData ::= (Char* - (Char* ']]>' Char*))
EDIT: This production rule literally means "A CData section may contain anything you want BUT the sequence ']]>'. No exception.".
EDIT2: The same section also reads:
Within a CDATA section, only the CDEnd string is recognized as markup, so that left angle brackets and ampersands may occur in their literal form; they need not (and cannot) be escaped using "&lt;" and "&amp;". CDATA sections cannot nest.
In other words, it's not possible to use entity references, markup or any other form of interpreted syntax. The only parsed text inside a CDATA section is ]]>, and it terminates the section.
Hence, it is not possible to escape ]]> within a CDATA section.
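The "no interpreted syntax" point can be checked with a parser. This is a small sketch using Python's standard `xml.etree.ElementTree`; the wrapper element `r` is only there to make each fragment a complete document.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, entity references and tags are not interpreted;
# they come back as literal text.
in_cdata = ET.fromstring("<r><![CDATA[&lt; &amp; <b>]]></r>")
print(in_cdata.text)  # &lt; &amp; <b>

# Outside CDATA, the parser expands the same references.
outside = ET.fromstring("<r>&lt; &amp;</r>")
print(outside.text)   # < &
```

Since nothing inside the section is interpreted, there is no escape mechanism that could protect a literal ]]>.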
EDIT3: The same section also reads:
2.7 CDATA Sections
[Definition: CDATA sections may occur anywhere character data may occur; they are used to escape blocks of text containing characters which would otherwise be recognized as markup. CDATA sections begin with the string "<![CDATA[" and end with the string "]]>":]
So a CDATA section may appear anywhere character data may occur, which means several adjacent CDATA sections can stand in place of a single one. That makes it possible to split the ]]> token and put its two parts in adjacent CDATA sections.
For example:
<![CDATA[Certain tokens like ]]> can be difficult and <invalid>]]>
should be written as
<![CDATA[Certain tokens like ]]]]><![CDATA[> can be difficult and <valid>]]>
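As a quick check of the two fragments above, the following sketch feeds both to Python's standard `xml.etree.ElementTree`. The surrounding `<r>` element is an assumption added so that each fragment forms a complete document.

```python
import xml.etree.ElementTree as ET

bad = "<r><![CDATA[Certain tokens like ]]> can be difficult and <invalid>]]></r>"
good = "<r><![CDATA[Certain tokens like ]]]]><![CDATA[> can be difficult and <valid>]]></r>"

# The first form ends the CDATA section early, so the rest is parsed
# as markup and the document is rejected.
try:
    ET.fromstring(bad)
    bad_parsed = True
except ET.ParseError:
    bad_parsed = False
print(bad_parsed)  # False

# The second form parses; the adjacent sections are read back as one
# run of text that contains the literal "]]>".
text = ET.fromstring(good).text
print(text)  # Certain tokens like ]]> can be difficult and <valid>
```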