Web weirdness

Aaron Crane perl at aaroncrane.co.uk
Wed Jun 20 00:36:08 BST 2007


David Cantrell writes:
> Lots of URLs on my web site contain the ASCII string "&image", such as
> here:
>   http://shorterlink.org/2580
> 
> But sometimes that gets turned into some binary gibberish, like some of
> the photo links here:
>   http://nou.livejournal.com/102938.html?thread=1474842
> 

HTML has a named entity "&image;" which corresponds to the character U+2111
BLACK-LETTER CAPITAL I.  And, lo, that's the funny-looking character that
appears in the broken places.

I'm not sure exactly where the brokenness is here, but I'm pretty sure that
something somewhere is failing to entity-encode this:

  .../photodetails.tt2?set=york-xmas-2005&image=commondale

to this:

  .../photodetails.tt2?set=york-xmas-2005&amp;image=commondale

when it's used in HTML source.

Then, presumably, something else is converting named entities to the
characters they represent.
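
As a rough illustration of those two steps, here is a minimal Python sketch.
It is not taken from either site: lenient_decode is a hypothetical stand-in
for whatever software is doing the second step, written on the assumption
that it accepts entity names without the trailing semicolon.  Only
html.escape and the html.entities.html5 table are real library pieces.

```python
import html
import re
from html.entities import html5  # maps names such as "image;" to characters

raw = ".../photodetails.tt2?set=york-xmas-2005&image=commondale"

# The step that should happen: entity-encode the URL before it goes into
# HTML source, so the bare "&" becomes "&amp;".
encoded = html.escape(raw)


def lenient_decode(text):
    """Hypothetical sloppy decoder: treats "&name" as an entity even
    when the terminating ";" is missing."""
    def replace(match):
        char = html5.get(match.group(1) + ";")
        # Leave unknown names untouched.
        return char if char is not None else match.group(0)
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);?", replace, text)


print(encoded)                         # URL with "&amp;image" in it
print(lenient_decode(encoded) == raw)  # encoded form decodes back cleanly
print(lenient_decode(raw))             # unencoded form: "&image" becomes U+2111
```

With the properly encoded URL, even the sloppy decoder gives back the
original string; it's only the unencoded "&image" that gets mangled.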

-- 
Aaron Crane

