Subclassing HTML::Parser to support $p->include()
Andy Armstrong
andy at hexten.net
Fri Feb 24 17:42:44 GMT 2006
I've just sent this to libwww at perl.org but I imagine someone here
might have a bright idea :)
I'm using HTML::Parser as part of a templating system that parses
HTML-formatted templates and interprets certain special tags. I'd
like to be able to implement a tag like
<include src="header.html" />
To do that I'd like to subclass HTML::Parser and add an include()
method that can be called in a tag handler and has the effect of
including a chunk of text in the parser's input. I need the included
text to appear in the HTML stream that HTML::Parser sees right after
the <include /> tag (so that the included text is in the right place).
My first thought is to provide a callback to $p->parse() that returns
the input text in chunks, breaking the text after each '>' - so that
the text immediately after each tag is in a new chunk. The
$p->include() method will tell the chunk-reading callback to read from
the included text up to EOF and then return to where it left off in
the original text (there'll be an include stack of course so that
nested includes work).
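
A minimal sketch of the chunk reader described above, written in Python rather than Perl so that it stands alone without HTML::Parser. The names ChunkSource, include, next_chunk and expand are hypothetical, and the expand() driver merely stands in for a parser that fires its tag handler as soon as it sees the closing '>', which is exactly the behaviour in question:

```python
import re

# Matches an <include src="..." /> tag sitting at the very end of a chunk.
INCLUDE_RE = re.compile(r'<include\s+src="([^"]*)"\s*/>$')


class ChunkSource:
    """Hands out text in chunks that each end just after a '>'."""

    def __init__(self, text):
        # Include stack of [text, position]; the top entry is being read.
        self._stack = [[text, 0]]

    def include(self, text):
        # Hypothetical counterpart of $p->include(): read `text` next,
        # then resume the interrupted text where it left off.
        self._stack.append([text, 0])

    def next_chunk(self):
        # Return the next chunk, or '' once every text is exhausted.
        while self._stack:
            text, pos = self._stack[-1]
            if pos >= len(text):
                self._stack.pop()  # EOF here: drop back to the parent text
                continue
            cut = text.find(">", pos)
            end = len(text) if cut == -1 else cut + 1
            self._stack[-1][1] = end
            return text[pos:end]
        return ""


def expand(template, files):
    # Toy stand-in for the parser. ASSUMPTION: the "handler" runs before
    # any text beyond the tag's closing '>' has been requested.
    source = ChunkSource(template)
    out = []
    while True:
        chunk = source.next_chunk()
        if chunk == "":
            break
        match = INCLUDE_RE.search(chunk)
        if match:
            out.append(chunk[:match.start()])
            source.include(files[match.group(1)])
        else:
            out.append(chunk)
    return "".join(out)
```

In the Perl version, next_chunk would be the code reference handed to $p->parse(), which is documented to call it repeatedly until it returns an empty or undefined result. Note that this naive splitting also breaks on a '>' inside an attribute value or comment; that is harmless as long as the parser copes with tags split across chunks.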
For that to work I have to rely on HTML::Parser issuing a tag
callback as soon as it sees the closing character of a tag - if it
reads ahead then it will have already digested text beyond the
<include /> tag by the time it issues the callback for the
<include /> tag.
I can easily check what the current behaviour is - and I shall - but
there's no contract that I can see about the relationship between the
text that HTML::Parser has read via a callback and when the handlers
trigger. So even if it works now it could potentially change in the
future - I don't want to rely on undocumented behaviour.
So, is what I'm proposing sensible? If not, is there a better way?
Assuming HTML::Parser currently behaves in the way I need it to, is it
likely always to do so?
Thanks :)
--
Andy Armstrong, hexten.net