Finding the intersection between two regexes
david at cantrell.org.uk
Tue Apr 22 12:16:14 BST 2014
On Sun, Apr 20, 2014 at 10:14:48PM -0400, Mark Fowler wrote:
> On Sunday, April 20, 2014, David Cantrell <david at cantrell.org.uk> wrote:
> > Can anyone point me at some code on the CPAN that, given two regexes,
> > can figure out whether there are any bits of text that will be matched
> > by both?
> I'm not sure I understand the question here, or moreover why you want to do
> this..is it just an intellectual exercise?
I do actually have a use for it, which would help to explain the question.
A large part of Number::Phone is based on data in Google's
libphonenumber project. That has, for most countries, regular
expressions that match valid fixed lines and valid mobiles. For some
countries those two regexes can both match some of the same numbers.
Here's the data:
If you look at the data for Barbados, they have for fixed lines:
and for mobiles:
then some strings will match both expressions - 2462303333, for example.
But if you look at the data for Jamaica there are no strings that match both.
At the moment I detect these overlaps (and then throw the regexes away
as being unfit for my purpose) by just going through each country's
number space. This is practical for NANP countries as I can do it
all with only about a million comparisons in the worst possible case. It
would be impractical to apply this to the whole world though.
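
For illustration, here is a minimal sketch of that kind of brute-force check. It is in Python rather than Perl, and it is not the code Number::Phone uses. The function name overlapping_numbers, the 555 prefix and all three patterns are invented for the example; they are not the libphonenumber data. It assumes each number is a fixed prefix followed by a fixed count of digits, and that a pattern has to match the whole number.

```python
import re


def overlapping_numbers(fixed_pattern, mobile_pattern, prefix, digits):
    """Yield each number (prefix + `digits` digits) matched in full by both patterns."""
    fixed = re.compile(fixed_pattern)
    mobile = re.compile(mobile_pattern)
    # Walk the whole number space: 10**digits candidates under this prefix.
    for n in range(10 ** digits):
        candidate = f"{prefix}{n:0{digits}d}"
        if fixed.fullmatch(candidate) and mobile.fullmatch(candidate):
            yield candidate


# Hypothetical patterns, invented for this sketch only.
FIXED = r"555[2-4]\d{3}"
MOBILE = r"555(?:4|7)\d{3}"
DISJOINT = r"5559\d{3}"

# Stop at the first overlap; None means the two patterns never overlap
# anywhere in the number space that was searched.
print(next(overlapping_numbers(FIXED, MOBILE, "555", 4), None))    # 5554000
print(next(overlapping_numbers(FIXED, DISJOINT, "555", 4), None))  # None
```

The cost is one pair of match attempts per candidate number, so it grows with the size of the number space and not with the complexity of the regexes. That is why it only stays practical for small, well-bounded ranges.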
David Cantrell | Bourgeois reactionary pig