Quarantining crap HTML?

dave.lambley@gmail.com dave.lambley at gmail.com
Tue May 21 14:00:33 BST 2013


About 10 years ago I built something that used HTML::TreeBuilder to remove any elements and attributes that aren't on a whitelist.
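
Roughly, the idea looks like the sketch below. This is not the original code: it is Python using the standard library's html.parser rather than HTML::TreeBuilder, and the whitelist contents and the names (WhitelistFilter, sanitize) are made up for illustration. It drops any tag or attribute not on the list, throws away script and style bodies, and closes whatever the source left open so it can't leak into the rest of the page.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist, purely for illustration.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li", "a"}
ALLOWED_ATTRS = {"a": {"href", "title"}}
VOID_TAGS = {"br"}                 # tags that never take a closing tag
DROP_CONTENT = {"script", "style"}  # drop the text inside these as well


class WhitelistFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []  # whitelisted tags currently open
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # tag is dropped, its text content is kept
        allowed = ALLOWED_ATTRS.get(tag, set())
        kept = ""
        for name, value in attrs:
            if name in allowed:
                kept += ' %s="%s"' % (name, escape(value or "", quote=True))
        self.out.append("<%s%s>" % (tag, kept))
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in self.open_tags:
            return  # stray or non-whitelisted closing tag
        # Close anything still open inside this element, then the element.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    f = WhitelistFilter()
    f.feed(html_text)
    f.close()
    # Close whatever the source left open.
    while f.open_tags:
        f.out.append("</%s>" % f.open_tags.pop())
    return "".join(f.out)


print(sanitize('<p onclick="x()">Hi <b>there<script>alert(1)</script> '
               '<font color="red">you</font>'))
```

Note that this only filters on tag and attribute names. Attribute values are not inspected, so an href with a javascript: URL would still get through; anything used for real would need to check those too.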

Dave
------Original Message------
From: Dave Hodgkinson
Sender: london.pm-bounces at london.pm.org
To: London.pm Perl M[ou]ngers
ReplyTo: London.pm Perl M[ou]ngers
Subject: Quarantining crap HTML?
Sent: 21 May 2013 12:31

In keeping with the spirit of the list, this isn't directly a perl question
but it might be part of the solution.

I'm picking up HTML from another site, and that HTML is pretty crappy.

Is there any way of quarantining it so it doesn't bugger up the rest of the 
page?






-- 
Sent from a tiny keypad.

