Test::XML not working with UTF-8

John Ramsden-Developer John.Ramsden2 at bbc.co.uk
Thu Feb 1 11:39:21 GMT 2007

As an IT contractor (hi all - just joined the list), I've been tasked
with making any amendments and tests needed to ensure that a large
web application of my client works reliably with UTF-8.

This has turned out, not surprisingly, to be the Perl equivalent of
climbing the north face of the Eiger with a hundredweight of pots
and pans tied round my waist!

(Not least because none of the editors on the client's Solaris
system is UTF-8 aware - I've been using 'od -xc', the hex dumper,
and have recommended they purchase SlickEdit, but any other
suggestions are welcome. Are there UTF-8 compliant versions of vim
or emacs, for example?)

Anyway, getting to the point, I wonder if anyone has any ideas
why Test::XML fails to recognize UTF-8 characters, or can think
of an alternative I might use if Test::XML is no good for UTF-8.

The following script fails with an error when $smiley is set to
the Unicode smiley character U+263A (which is three bytes in UTF-8),
but passes when it is set to ';-)'.

    use strict;
    use warnings;

    use utf8;
    use encoding 'utf8';      # may be the same as 'use utf8' ?!

    my $smiley = "\x{263A}";  # test works fine if smiley is ';-)'

    use Test::XML tests => 1;

    my $xml_found = '<?xml version="0.1234" encoding="UTF-8"?>'
                  . '<hack>smiley ' . $smiley . ' </hack>';

    my $xml_expected = $xml_found;

    if (! Test::XML::is_xml($xml_found, $xml_expected)) {
        print "Test::XML::is_xml() returned false!\n";
    }
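
For anyone reproducing this, here is a minimal sketch of the distinction
between the character U+263A and its UTF-8 byte sequence. It uses only
the core Encode module, which the script above does not load, and the
name $bytes is purely illustrative. It does not claim to explain the
Test::XML failure; it only shows which of the two forms a given string
holds.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A character string: one Unicode character, U+263A.
my $smiley = "\x{263A}";

# The same character encoded as UTF-8: a byte string (illustrative name).
my $bytes = encode('UTF-8', $smiley);

printf "characters: %d\n", length $smiley;        # characters: 1
printf "bytes:      %d\n", length $bytes;         # bytes:      3
printf "hex:        %s\n", unpack('H*', $bytes);  # hex:        e298ba
```

The 'e2 98 ba' sequence is what 'od -xc' shows for this character in a
UTF-8 file.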


John R Ramsden

P.S. Is there a difference between 'use utf8' and 'use encoding utf8'?
One of my colleagues reckons they are equivalent.
