substr vs regex

Matt Lawrence matt.lawrence at virgin.net
Thu Sep 6 14:05:39 BST 2007


David Cantrell wrote:
> On Mon, Sep 03, 2007 at 11:44:53AM +0200, Abigail wrote:
>   
>> On Mon, Sep 03, 2007 at 10:26:12AM +0100, alex at owal.co.uk wrote:
>>     
>>> Imagine, say, someone wanted the last three characters of a string. They
>>> might use a regex /(...)$/ or substr($variable, -3)
>>>       
>> substr() is far more efficient than using a regexp.
>>     
>
> Aye.
>
>   
>>                                                     The speed of the
>> substr() solution is independent of the length of $variable
>>     
>
> Only true for constant-width character sets.  This is why UTF-8 is a
> stupid stupid design.
>
>   
I guess you could convert to UTF-32 for substr work. That way you have
constant width, but at the cost of your data being up to 4 times bigger
than is necessary. Simply multiply all arguments to substr by 4 and
you're away! ;-)
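
For the curious, here is a rough sketch of what that would look like, assuming the core Encode module and an explicit UTF-32BE encoding (so no BOM gets prepended). The sample string and variable names are made up for illustration; this is not a recommendation, just the joke taken literally.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample: "cafe" with an acute accent, a space, three smileys.
my $string = "caf\x{e9} \x{263a}\x{263a}\x{263a}";

# UTF-32BE stores every character in exactly 4 bytes.
my $bytes = encode('UTF-32BE', $string);

# Byte-level substr with every argument multiplied by 4.
my $last3 = decode('UTF-32BE', substr($bytes, -3 * 4, 3 * 4));

# The two character-level approaches from earlier in the thread.
my $via_substr  = substr($string, -3);
my ($via_regex) = $string =~ /(...)$/;

# Print sizes only, to avoid wide-character warnings on STDOUT.
printf "%d chars, %d bytes as UTF-8, %d bytes as UTF-32\n",
    length($string), length(encode('UTF-8', $string)), length($bytes);
# prints "8 chars, 15 bytes as UTF-8, 32 bytes as UTF-32"

print "all three agree\n"
    if $last3 eq $via_substr && $last3 eq $via_regex;
```

Note that for pure ASCII data the UTF-32 copy really is 4 times the size; the sample above only roughly doubles because the smileys already take 3 bytes each in UTF-8.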


Matt


