Web weirdness

Aaron Crane perl at aaroncrane.co.uk
Wed Jun 20 00:36:08 BST 2007

David Cantrell writes:
> Lots of URLs on my web site contain the ASCII string "&image", such as
> here:
>   http://shorterlink.org/2580
> But sometimes that gets turned into some binary gibberish, like some of
> the photo links here:
>   http://nou.livejournal.com/102938.html?thread=1474842

HTML has a named entity "&image;" which corresponds to the character U+2111
BLACK-LETTER CAPITAL I.  And, lo, that's the funny-looking character that
appears in the broken places.

I'm not sure exactly where the brokenness is here, but I'm pretty sure that
something somewhere is failing to entity-encode this:

  &image

to this:

  &amp;image

when it's used in HTML source.

Then, presumably, something else is converting named entities to the
characters they represent.
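
To make that concrete, here is a minimal sketch in Python. The URL is made up
for illustration, and Python's standard html module merely stands in for
whatever software is actually involved, which I haven't identified. It shows
what correct encoding of the ampersand looks like, and what a decoder does
with the named entity:

```python
import html

# Hypothetical URL with a query parameter named "image" (not one of the real links).
url = "http://example.org/photo?id=42&image=7"

# Correct handling: escape the ampersand before putting the URL in HTML source.
escaped = html.escape(url)

# Anything that later decodes the HTML gets the original URL back.
round_tripped = html.unescape(escaped)

# The named entity on its own decodes to BLACK-LETTER CAPITAL I.
entity_char = html.unescape("&image;")

print(escaped)
print(round_tripped == url)
print(f"U+{ord(entity_char):04X}")
```

If the ampersand goes out unescaped, any later step that is liberal about
recognising entity names has a chance to read "&image" as the entity rather
than as part of the query string.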

Aaron Crane
