Advice on HTML editing

Ovid publiustemp-londonpm at yahoo.com
Thu Apr 29 11:58:34 BST 2010


----- Original Message ----
> From: Roger Burton West <roger at firedrake.org>


> What I tend to do in this sort of situation is 
> use HTML::TokeParser (which is just an alternative
> interface to HTML::Parser that matches better with
> the way I work) and re-emit everything except the
> tokens that I want to fiddle with. Not claiming this
> is the best way to go, but most of what I do is parsing
> HTML rather than modifying it, and I already use 
> HTML::TokeParser a lot.

May I suggest my HTML::TokeParser::Simple instead? It's much easier to use, and it makes the task at hand very straightforward:

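     use HTML::TokeParser::Simple;

     # $doc is the path to the HTML file to rewrite
     my $parser = HTML::TokeParser::Simple->new( file => $doc );
     while ( my $token = $parser->get_token ) {
         if ( $token->is_start_tag('img') ) {
             # img normally carries src rather than href; swap in
             # whichever tag/attribute pair you actually need
             my $href = $token->get_attr('href');
             if ( defined $href ) {    # skip tags lacking the attribute
                 $href =~ s/foo/bar/;
                 $token->set_attr('href', $href);
             }
         }
         print $token->as_is;    # re-emit every token, changed or not
     }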
   
That's about as easy as you can get and seems to satisfy OP's requirements.
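
For anyone who wants to try the same loop without a file on disk, here is a minimal sketch that parses from a string and returns the rewritten HTML instead of printing it. The sample markup, the rewrite_attr helper, and its parameters are made up for illustration; the thread does not show the OP's actual document or which tag and attribute need changing.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical sample input, not the OP's real document.
my $html = '<p><a href="/foo/page.html">link</a> <img src="foo.png" alt="x"></p>';

# rewrite_attr is an illustrative helper, not part of the module.
# It replaces the first literal occurrence of $from with $to in the
# given attribute of every matching start tag.
sub rewrite_attr {
    my ( $source, $tag, $attr, $from, $to ) = @_;

    my $parser = HTML::TokeParser::Simple->new( string => $source );
    my $out    = '';

    while ( my $token = $parser->get_token ) {
        if ( $token->is_start_tag($tag) ) {
            my $value = $token->get_attr($attr);
            if ( defined $value ) {    # leave tags without the attribute alone
                $value =~ s/\Q$from\E/$to/;
                $token->set_attr( $attr, $value );
            }
        }
        $out .= $token->as_is;         # untouched tokens pass through verbatim
    }
    return $out;
}

print rewrite_attr( $html, 'a', 'href', 'foo', 'bar' ), "\n";
```

Only the tag and attribute passed in are touched, so in this sample the img tag's src still contains "foo" while the link's href now points under /bar/.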

Cheers,
Ovid

--
Buy the book - http://www.oreilly.com/catalog/perlhks/
Tech blog - http://blogs.perl.org/users/ovid/
Twitter - http://twitter.com/OvidPerl
Official Perl 6 Wiki - http://www.perlfoundation.org/perl6




