UTF-8 + HTML::Template + CGI::Fast

Peter Corlett abuse at cabal.org.uk
Fri Dec 4 17:15:21 GMT 2009


On 4 Dec 2009, at 15:19, Mark Fowler wrote:
[...]
> Let's assume you're sane and you've told your webserver to serve utf-8
> (and you've got a utf-8 header in the Content-Type) for the page the
> form is created from.  Most browsers will return you utf-8 in this
> situation.  Some will not (they are broken.)

As far as I could tell from the last time I had this problem, if you  
omitted the accept-charset attribute from the <form> tag, the browser  
would use its default character set, which was UTF-8 in Firefox and
Windows-1252 in IE. Setting <form accept-charset="UTF-8" ...> made IE
play nicely.

If you've specifically asked for UTF-8 text, and the byte stream you  
receive is not a valid UTF-8 encoding, you can reasonably assume that
it's actually Windows-1252 instead. Windows-1252 matches Latin-1
everywhere except 0x80-0x9F, where it puts printable characters in
place of Latin-1's rarely used C1 controls, so that assumption still
holds even if the client has sent Latin-1.
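
To make that fallback concrete, here is a minimal sketch in Python (picked
only for brevity; the same idea works with Perl's Encode module). The
function name decode_form_bytes is made up for illustration, and replacing
the five byte values that Windows-1252 leaves undefined with U+FFFD is my
own assumption about how to handle them, not something any browser or
server does for you.

```python
def decode_form_bytes(raw: bytes) -> str:
    """Decode submitted form bytes, preferring UTF-8 (hypothetical helper)."""
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume Windows-1252 (which also covers Latin-1 text).
        # Python's cp1252 codec leaves 0x81, 0x8D, 0x8F, 0x90 and 0x9D undefined;
        # errors="replace" turns those into U+FFFD instead of raising (assumption).
        return raw.decode("cp1252", errors="replace")


if __name__ == "__main__":
    print(decode_form_bytes("café".encode("utf-8")))  # valid UTF-8 path
    print(decode_form_bytes(b"caf\xe9"))              # Latin-1 / Windows-1252 path
    print(decode_form_bytes(b"\x93quoted\x94"))       # Windows-1252 curly quotes
```

One caveat: a few Windows-1252 byte sequences also happen to be valid
UTF-8 (for example 0xC3 0xA9), and those will be taken as UTF-8. In real
form input that collision is rare enough that the heuristic still works
well.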

Getting something other than Latin-1, Windows-1252 or UTF-8 posted to  
your web forms is vanishingly unlikely.




More information about the london.pm mailing list