XML::LibXML and HTML (in >=v1.67)

peter at dragonstaff.com
Wed Apr 1 11:17:38 BST 2009

Quoting Dave Cross <dave at dave.org.uk>:
> Toby Wintermute wrote:
> What you're trying to parse isn't XML. Therefore you shouldn't expect
> to be able to parse it with an XML parser.
>> Alternatively.. what do YOU use to parse real-world websites that are
>> often not totally valid?

A similar problem arises when writing an XML editor: you have to be able to
parse incomplete or inconsistent XML to do code highlighting.
I use a combination of a C parser (from the source at
http://www.scintilla.org/ScintillaDownload.html) and Perl regexp matching
of XML tokens.
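
As a rough illustration of regexp-based token matching that tolerates
incomplete markup, here is a minimal sketch in Python. It is not the C/Perl
code mentioned above; the pattern, the token names and the tokenize()
function are all assumptions made for this example. The idea is that every
alternative accepts a missing terminator, so a half-typed tag or comment
still yields a token a highlighter could colour.

```python
import re

# Ordered alternatives: the first one that matches at the current position wins.
# Each markup alternative tolerates a missing terminator.
TOKEN_RE = re.compile(
    r"""
    (?P<comment><!--.*?(?:-->|\Z))   # comment, possibly unterminated
  | (?P<close_tag></[^<>]*>?)        # closing tag, final '>' may be missing
  | (?P<open_tag><[^<>]*>?)          # any other tag, final '>' may be missing
  | (?P<text>[^<]+)                  # character data between tags
    """,
    re.VERBOSE | re.DOTALL,
)


def tokenize(source):
    """Split source into (kind, text) pairs without rejecting broken input."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]


if __name__ == "__main__":
    # Deliberately broken: <b> and <doc> are never closed, and the comment
    # has no terminator.
    broken = "<doc><p class='x'>Hello <b>world</p><!-- note <p"
    for kind, text in tokenize(broken):
        print(kind, repr(text))
```

Because every character is consumed by some alternative, joining the token
texts gives back the input unchanged, which is convenient when mapping
tokens to editor positions. The sketch ignores details such as '>' inside
attribute values, CDATA sections and processing instructions.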

Regards, Peter
