Why Perl needs a VM
Aaron Trevena
aaron.trevena at gmail.com
Wed Sep 5 13:16:18 BST 2007
On 05/09/07, Ismail, Rafiq (IT) <Rafiq.Ismail at morganstanley.com> wrote:
> <!-- outlook.. Apologies for toppost -->
outlook is no excuse ;)
http://home.in.tum.de/~jain/software/outlook-quotefix/
> For large feeds, depending on the structure and semantics of your
> documents, one approach you may want to consider is a combination of
> SAX and DOM parsing, where at SAX time you reconstruct a subtree of
> your main document and then DOM-parse just that subtree. This would
> constrain the depth of your XPaths and the overall size of the DOM
> tree. For large trees which require DOM parsing it can be quite
> performant. You could also potentially parallelise your processing
> of a single document via this approach..
>
> Depending on your requirements, you could also hold some state between
> subtree parses and inject nodes into the reconstructed tree - where
> there is a dependency on some previously parsed artifact..
>
> The likes of XML::Twig might also be worth looking at, but I don't know
> much about the underlying implementation/performance.
I'd be curious to know how much difference either approach would make in
practice. It's certainly worth appraising the way the XML is being
processed first - an inefficient algorithm at the top level will make a
much bigger difference than the speed of a lower-level library.
A.
--
http://www.aarontrevena.co.uk
LAMP System Integration, Development and Hosting