Regexp capture group list

Paul LeoNerd Evans leonerd at
Tue Nov 10 13:11:03 GMT 2009

I'm writing an attempt at a simple recursive-descent parser with no
backtracking or alternation, for parsing a really simple grammar.

My usual method is to write a collection of functions that eat a prefix
from the string they're passed as $_[0] (mutably so), and return any
interesting data. A basic primitive to start with is something like:

 # Eat a prefix matching $re from the caller's string, or die
 sub parse {
    my ( $text, $re ) = @_;
    $_[0] =~ s/^$re// or die "Expected $re in $text...\n";
 }

 sub parse_idspec {
    parse $_[0], qr/ID\s+(\d+)/ and return $1;
 }

I was rather annoyed to find that the regexp capture buffers $1, $2,
etc. are in fact dynamically scoped. This means that $1 can't escape
from parse(). It behaves as if 'local $1' were present in parse(); $1 in
parse_idspec() still contains whatever it held before the call.
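
A minimal sketch of that scoping behaviour, for illustration only. The
helper name match_inner and the sample strings are made up here and are
not part of the parser above:

```perl
use strict;
use warnings;

# Hypothetical helper: runs its own match, much as parse() does.
sub match_inner {
    my ( $text ) = @_;
    $text =~ m/ID\s+(\d+)/ or die "no match\n";
    return $1;    # visible here, inside the sub that ran the match
}

my $outer = "outer 7";
$outer =~ m/(\d+)/ or die "no match\n";    # sets $1 to "7" in this scope

my $inner = match_inner( "ID 42" );
my $seen  = $1;    # the caller's $1 is restored once the sub returns

print "inner saw:   $inner\n";    # 42
print "caller sees: $seen\n";     # 7
```

The match inside the sub does not leak out: once match_inner() returns,
$1 in the caller is back to the value set by the caller's own match.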

After some headscratching I decided instead to have parse() return a
list of the capture groups. So far I haven't found a neater expression
than:

 sub parse {
    my ( $text, $re ) = @_;
    $_[0] =~ s/^$re// or die "Expected $re in $text...\n";

    # $text is the pre-substitution copy, so the @- / @+ offsets apply to it
    return map { substr $text, $-[$_], $+[$_]-$-[$_] } 1 .. $#+;
 }
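
For illustration, a caller might use the list-returning parse() roughly
as follows. This is only a sketch: the sample input string is invented,
and parse_idspec() is rewritten here to take the ID from the returned
list rather than from $1. Note that a capture group which did not take
part in the match would have undefined offsets, which this sketch does
not handle.

```perl
use strict;
use warnings;

# parse() as described above: eats a prefix matching $re from $_[0]
# and returns the capture groups as a list.
sub parse {
    my ( $text, $re ) = @_;
    $_[0] =~ s/^$re// or die "Expected $re in $text...\n";

    return map { substr $text, $-[$_], $+[$_] - $-[$_] } 1 .. $#+;
}

# Caller takes the first capture group from the returned list.
sub parse_idspec {
    my ( $id ) = parse $_[0], qr/ID\s+(\d+)/;
    return $id;
}

my $input = "ID 42 rest";    # made-up sample input
my $id    = parse_idspec( $input );

print "id=$id, remaining='$input'\n";    # id=42, remaining=' rest'
```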

This seems a common-enough idiom that perhaps there's a neater solution
- I find there's no @{^MATCHGROUPS} or similar present in perl...
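
One possible alternative, sketched here as an assumption rather than a
known-best answer: a match in list context already returns the capture
groups, so the match and the prefix removal can be done as two separate
steps. The name parse_list is hypothetical, chosen only to keep it apart
from parse() above.

```perl
use strict;
use warnings;

# Hypothetical variant of parse(): match first, then drop the prefix.
sub parse_list {
    my ( $text, $re ) = @_;

    # In list context a successful match returns ( $1, $2, ... ).
    my @groups = $_[0] =~ m/^$re/
        or die "Expected $re in $text...\n";

    # With no capture groups at all, a successful list-context match
    # returns (1), so fall back to an empty list in that case.
    @groups = () if $#+ == 0;

    # $+[0] is the end offset of the whole match: remove that prefix
    # from the caller's string.
    substr( $_[0], 0, $+[0], "" );

    return @groups;
}

my $input = "ID 42 rest";    # made-up sample input
my @got   = parse_list( $input, qr/ID\s+(\d+)/ );

print "groups=(@got), remaining='$input'\n";    # groups=(42), remaining=' rest'
```

The trade-off is that the string is touched twice (once to match, once
to remove the prefix) in exchange for not rebuilding the groups from
offsets by hand.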

Can anyone offer any neater suggestions?

Paul "LeoNerd" Evans

leonerd at
ICQ# 4135350       |  Registered Linux# 179460