LWP output encoding
Struan Donald
lpm at exo.org.uk
Wed Nov 23 15:03:24 GMT 2005
* at 23/11 14:49 +0000 Andy Armstrong said:
> Googled. Can't figure. Can anyone update me on what the current
> semantics of HTTP::Response->decoded_content are?
>
> Specifics: I'm parsing a bunch of RSS feeds. I have two, both of
> which claim to be encoded UTF-8. I'm generating a hash for the
> contents of the feeds like this
>
> my $content = $res->decoded_content;
> my $hash = md5_base64($content);
>
> md5_base64() barfs on one of the feeds with
>
> "Wide character in subroutine entry"
I can't really say what it returns, but I cured the same problem
with Digest::MD5 by encoding the content before passing it in, so
that it's plain old octets, and all was well.
i.e. one does:
use Encode qw(encode);
use Digest::MD5 qw(md5_base64);
my $content = encode( 'utf8', $res->decoded_content );
my $hash = md5_base64($content);
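A minimal self-contained sketch of the same thing, for anyone who wants to try it without a live feed. The sample string is made up and only stands in for whatever $res->decoded_content hands back; the one assumption is that it contains at least one character above code point 255, which is what Digest::MD5 objects to.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Digest::MD5 qw(md5_base64);

# Hypothetical stand-in for $res->decoded_content: a character string
# containing a code point above 255 (WHITE SMILING FACE).
my $content = "caf\x{e9} \x{263A}";

# Digest::MD5 works on octets, so handing it the character string
# directly croaks with "Wide character in subroutine entry".
my $died = !eval { md5_base64($content); 1 };

# Turn the characters into UTF-8 octets first, then digest those.
# 'UTF-8' is Encode's strict encoding; 'utf8' is its laxer variant.
my $octets = encode('UTF-8', $content);
my $hash   = md5_base64($octets);

print "raw call died: ", ($died ? "yes" : "no"), "\n";
print "digest: $hash\n";
```

The digest is then a hash of the UTF-8 bytes, so two feeds only compare equal if their decoded text is identical, whatever encoding they arrived in.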
I am sure someone who understands more about this will be along to
explain why this is not a good idea...
s