[OT] xml encoding

Dominic Mitchell dom at happygiraffe.net
Fri Jan 6 16:24:47 GMT 2006


On Fri, Jan 06, 2006 at 04:11:40PM +0000, Dirk Koopman wrote:
> On Fri, 2006-01-06 at 15:40 +0000, Dirk Koopman wrote:
> > Preventing that (by doing things more "manually" or using
> > xmlNewTextChild()) produces output like the first example.
> 
> Actually, it doesn't (because I can't even edit emails properly anymore
> it seems), it carefully escapes the '&' characters to '&amp;',
> producing:
> 
> <PASSWORD>rs&amp;#16;&amp;#30;&amp;#25;*  &amp;#6;</PASSWORD>
> 
> *instead* of 
> 
> <PASSWORD>rs&#16;&#30;&#25;*  &#6;</PASSWORD>
> 
> which is probably what I want.

But this is disallowed.  You can't have (say) &#6; because U+0006 is
not a valid XML 1.0 character, and a character reference has to refer
to a character that the spec allows.  As many other people have
mentioned, you really need to base64 encode (or perhaps URI encode for
small amounts) your data.

See the annotated XML specification:

    http://xml.com/axml/testaxml.htm

which shows that:

Char ::= #x9 | #xA | #xD
         | [#x20-#xD7FF]
         | [#xE000-#xFFFD]
         | [#x10000-#x10FFFF] /* any Unicode character, excluding the
                               * surrogate blocks, FFFE, and FFFF. */
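
The Char production above translates directly into a range check.
This is only a sketch, again in Python; is_xml_char and invalid_chars
are names I made up for illustration.

```python
def is_xml_char(cp):
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def invalid_chars(text):
    """List the code points in text that may not appear in XML 1.0."""
    return [ord(ch) for ch in text if not is_xml_char(ord(ch))]


# The password from the quoted example, with its control characters.
print(invalid_chars("rs\x10\x1e\x19*  \x06"))  # [16, 30, 25, 6]
```

Running a check like this before serialising tells you whether the
data can go in as plain text or needs encoding first.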

-Dom

