Why Perl needs a VM

Tue Sep 4 22:59:40 BST 2007

On Tue, Sep 04, 2007 at 03:34:00PM -0400, Matt Sergeant wrote:
>On 4-Sep-07, at 2:20 PM, ben at bpfh.net wrote:
>
>>I've had a poke at the code and sure enough, we're using XPathContext,
>>which I'd thought was a pure perl piece on top of XML::LibXML. It isn't -
>>it's got a C implementation at its heart.
>>
>>The Java implementation is still substantially quicker.
>
>Then you're doing something wrong. Or it's not the XPath part that's slow.
>
>XML::LibXML is significantly faster than any Java implementation.
>
>http://www.xml.com/pub/a/2007/05/16/xml-parser-benchmarks-part-2.html

Matt, these benchmarks are very interesting - thanks for posting them.

Our typical documents are 2-10M in size, which goes some way to
explaining what we're seeing: that's the range where the results you
pointed at show Java 1.5 or JDOM starting to be faster than libxml2.

Of course, we should also remember that these benchmarks are strictly
for libxml2, rather than XML::LibXML. I would expect Perl's string
handling to add only a trivial additive constant to the run time, which
would be lost in the noise on a 4M document, but that assumption is
probably worth checking.

I'll have a proper look when I get some extra tuits - I'm particularly
interested in how sensitive these numbers are to the ratio of number 
of nodes to size of document, but this is a great signpost.
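
A rough sketch of how both checks might be run from Perl, assuming
XML::LibXML (with its XPathContext) and Time::HiRes are installed. The
helper names make_doc and time_parse_and_query, the <item> element, the
'//item' query and the node counts are all made up for illustration and
are not taken from our code. It holds the document at about 4M and varies
only the number of nodes, timing a parse plus one XPath query each time:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical helper: build a flat document of roughly $target_bytes
# holding $nodes <item> elements. Fewer nodes means more text padding
# per node, so the overall size stays about constant.
sub make_doc {
    my ($target_bytes, $nodes) = @_;
    my $pad = int($target_bytes / $nodes) - length('<item></item>');
    $pad = 0 if $pad < 0;
    my $item = '<item>' . ('x' x $pad) . '</item>';
    return '<root>' . ($item x $nodes) . '</root>';
}

# Hypothetical helper: time one parse plus one XPath query.
# Returns (elapsed seconds, number of matching nodes).
sub time_parse_and_query {
    my ($xml) = @_;
    my $start = time;
    my $doc   = XML::LibXML->new->parse_string($xml);
    my $xpc   = XML::LibXML::XPathContext->new($doc);
    my @hits  = $xpc->findnodes('//item');
    return (time - $start, scalar @hits);
}

my $size = 4 * 1024 * 1024;    # about 4M, inside the 2-10M range
for my $nodes (1_000, 10_000, 100_000) {
    my ($secs, $hits) = time_parse_and_query(make_doc($size, $nodes));
    printf "%7d nodes, %6d hits: %.3fs\n", $nodes, $hits, $secs;
}
```

This only gives end-to-end XML::LibXML timings. To separate the Perl
overhead from libxml2 itself, the same generated documents would need to
go through a non-Perl harness as well, for instance xmllint --timing.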

Cheers,

Ben