Web scraping frameworks?

James Laver james.laver at gmail.com
Tue Mar 4 22:35:19 GMT 2014

On 4 Mar 2014, at 22:10, DAVID HODGKINSON <davehodg at gmail.com> wrote:

> For what I'm thinking, a way of relating named divs (and lists of) on
> a page to the hash elements needed for poking into DBIx::Class.
> As for Web::Scraper, it's Miyagawa-ware, so definitely worth looking
> at.

Sounds like what you actually want is a handful of app-specific lines of code around HTML::TreeBuilder. You can fetch with LWP (maybe LWP::Simple if your needs are small) or WWW::Mechanize for more complex stuff, or whatever else.
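
A rough sketch of the sort of thing I mean, not a definitive implementation. The div ids, the "tags" list, the column names and the extract_row helper are all made up for illustration; only the HTML::TreeBuilder / HTML::Element calls (new_from_content, look_down, as_trimmed_text, delete) are real.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical mapping: id of a div on the page => column name in the row hash.
my %field_for = (
    'product-name'  => 'name',
    'product-price' => 'price',
);

# Parse a page and return a hashref keyed by column name.
sub extract_row {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my %row;

    # Single named divs: first match wins, missing divs are skipped.
    for my $id (keys %field_for) {
        my $div = $tree->look_down(_tag => 'div', id => $id) or next;
        $row{ $field_for{$id} } = $div->as_trimmed_text;
    }

    # A list of things: collect the text of each <li> under a named <ul>.
    # (The id "tags" is an assumption for this example.)
    if (my $list = $tree->look_down(_tag => 'ul', id => 'tags')) {
        $row{tags} = [ map { $_->as_trimmed_text } $list->look_down(_tag => 'li') ];
    }

    $tree->delete;    # free the parse tree explicitly
    return \%row;
}

# Example input; in real use the HTML would come from LWP::Simple's get($url)
# or a WWW::Mechanize object instead of a literal string.
my $html = <<'HTML';
<div id="product-name"> Widget </div>
<div id="product-price">9.99</div>
<ul id="tags"><li>small</li><li>blue</li></ul>
HTML

my $row = extract_row($html);
print "$_ => ", (ref $row->{$_} ? join(',', @{ $row->{$_} }) : $row->{$_}), "\n"
    for sort keys %$row;
```

The scalar fields could then go more or less straight into something like $schema->resultset('Product')->create(...), where the schema and the Product result class are again hypothetical. The list would most likely need mapping onto a related table first.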

FWIW, last time I got involved in web scraping, this approach worked quite well, and while it’s not immediately reusable, it’s pretty straightforward.

