character set detection?
ben at bpfh.net
Sun Jan 7 13:39:00 GMT 2007
On Sun, Jan 07, 2007 at 01:13:12PM +0000, Dominic Mitchell wrote:
>That is, treat the input as UTF-8 by default (which *is* reliably
>recognisable, and also catches plain ASCII), and failing that, treat it
>as Windows-1252, which is (more-or-less) a superset of ISO-8859-1.
Um. Some multi-byte UTF-8 sequences are also valid ISO-8859-1 text,
I believe, although this is a corner case.
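
For what it's worth, the fallback order described in the quoted text could be sketched roughly as below. This is only an illustration in Python, not anything from the thread: the function name decode_guess is made up, and replacing the handful of bytes that Windows-1252 leaves undefined with U+FFFD is my own assumption about what "failing that" should do.

```python
from typing import Tuple


def decode_guess(data: bytes) -> Tuple[str, str]:
    """Hypothetical helper: return (text, encoding) trying UTF-8 first.

    Plain ASCII is valid UTF-8, so it is caught by the first branch.
    """
    try:
        # Strict decoding: any malformed sequence raises UnicodeDecodeError.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: fall back to Windows-1252. Python's cp1252 codec has
        # five undefined bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D); replace them
        # with U+FFFD rather than failing a second time.
        return data.decode("cp1252", errors="replace"), "cp1252"


if __name__ == "__main__":
    print(decode_guess(b"caf\xc3\xa9"))  # well-formed UTF-8
    print(decode_guess(b"caf\xe9"))      # lone 0xE9 is not valid UTF-8
    # The corner case: these two bytes are well-formed UTF-8 and also
    # decodable as ISO-8859-1, where they come out as two characters.
    print(b"\xc3\xa9".decode("iso-8859-1"))
```

The last line shows the ambiguity being discussed: the same two bytes decode under either encoding, so UTF-8-first guesses "é" even if the sender really meant the two ISO-8859-1 characters.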