Web scraping frameworks?
James Laver
james.laver at gmail.com
Tue Mar 4 22:35:19 GMT 2014
On 4 Mar 2014, at 22:10, DAVID HODGKINSON <davehodg at gmail.com> wrote:
> For what I'm thinking, a way of relating named divs (and lists of) on
> a page to the hash elements needed for poking into DBIx::Class.
>
> As for Web::Scraper, it's Miyagawa-ware, so definitely worth looking
> at.
Sounds like what you actually want is a handful of app-specific lines of code around HTML::TreeBuilder. You can fetch pages with LWP (maybe LWP::Simple if your needs are small), with WWW::Mechanize for more complex stuff, or with whatever else you prefer.
FWIW, last time I got involved in web scraping, this approach worked quite well, and while it’s not immediately reusable, it’s pretty straightforward.
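To give a rough idea of what "a handful of lines around HTML::TreeBuilder" might look like, here is a minimal sketch. The div ids, the column names, the 'tags' list and the 'Product' result class are all made up for illustration; the HTML is inlined so the example stands alone, where real code would fetch it first (for instance with get($url) from LWP::Simple).

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical mapping: div id on the page => column name in the row hash.
my %COLUMN_FOR_DIV = (
    'product-name'  => 'name',
    'product-price' => 'price',
);

# Returns a hashref of column => text for the named divs, plus an
# arrayref of the texts of the <li> children of one named <ul>.
sub extract_row {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);

    my %row;
    for my $id (sort keys %COLUMN_FOR_DIV) {
        # In scalar context look_down returns the first match (or undef).
        my $div = $tree->look_down(_tag => 'div', id => $id) or next;
        $row{ $COLUMN_FOR_DIV{$id} } = $div->as_trimmed_text;
    }

    my @tags;
    if (my $ul = $tree->look_down(_tag => 'ul', id => 'tags')) {
        # In list context look_down returns every matching element.
        @tags = map { $_->as_trimmed_text } $ul->look_down(_tag => 'li');
    }

    $tree->delete;    # free the parse tree explicitly
    return (\%row, \@tags);
}

# Inline sample page (an assumption, standing in for a fetched document).
my $html = <<'HTML';
<html><body>
  <div id="product-name"> Widget </div>
  <div id="product-price">9.99</div>
  <ul id="tags"><li>small</li><li>blue</li></ul>
</body></html>
HTML

my ($row, $tags) = extract_row($html);

# $row is now shaped for DBIx::Class, e.g.
#   $schema->resultset('Product')->create($row);
# ($schema and the 'Product' result class are assumed, not shown here).
```

The mapping hash is the only app-specific part; everything else is a couple of look_down calls, which is roughly why this style stays straightforward even though it is not reusable as-is.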
James