Regex teaser

Matt Lawrence matt.lawrence at virgin.net
Wed Dec 4 10:09:33 GMT 2013


On 04/12/13 07:55, Paul Makepeace wrote:
> On Tue, Dec 3, 2013 at 5:03 PM, Mark Fowler <mark at twoshortplanks.com> wrote:
>> On Tue, Dec 3, 2013 at 6:54 PM, Paul Makepeace <paulm at paulm.com> wrote:
>>
>>> $ perl -le '($a = "aabbb") =~ s/b*$/c/g; print $a'
>> This is where tools like Regexp::Debugger shine.  Running
>>
>>   perl -le 'use Regexp::Debugger; ($a = "aabbb") =~ s/b*$/c/g; print $a'
>>
>> Shows exactly why it gives the output it does (if you hit "n" for next a lot)
> Can't use an undefined value as an ARRAY reference at
> /Library/Perl/5.16/Regexp/Debugger.pm line 499.
>
> Glad we're not the only ones confused by it ;-) But yeah that's neat.
> I don't agree it shows WHY as much as HOW.
>
> The puzzle comes down to whether the $ is part of the first b*
> capture. IMO it is (and python seems to agree). Why the engine
> restarts having captured as much as it can to the very end strikes me
> as counter intuitive. Almost, if not actually, bug-like.
>

I don't think it comes down to the meaning of $; it comes down to
the meaning of *.

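Perl finds four b* matches in 'aabbb': an empty one before each 'a',
the 'bbb' itself, and an empty one at the very end of the string. The $
rules out the first two, which leaves 'bbb' and the trailing empty match,
hence two replacements and "aacc". The edge case is whether or not a
zero-width match can immediately follow a non-zero-width match.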

Matt



