DBI, BLOBS, MySQL

Sun Jan 11 14:23:10 GMT 2009

Tim Sweetman writes:
>> "If a TEXT column is indexed, index entry comparisons are
>> space-padded at the end. This means that, if the index requires
>> unique values, duplicate-key errors will occur for values that
>> differ only in the number of trailing spaces. For example, if a
>> table contains 'a', an attempt to store 'a ' causes a duplicate-key
>> error"
>>
>> Genius. True genius.
>
> Isn't this generally true for VARCHAR fields?

CHAR, not VARCHAR; that's fundamentally the difference between CHAR
(aka "like a string, but broken") and VARCHAR (aka "like a string, but
broken in some implementations").

> Certainly MySQL does that

MySQL does that at least sometimes for VARCHAR, but I believe it's
considered a bug.  More specifically, trailing spaces (or their
absence) are correctly handled on retrieval, but not on comparison:

  > create temporary table t (id int primary key auto_increment, s varchar(255));
  Query OK, 0 rows affected (0.00 sec)

  > insert into t (s) values ('foo'), ('foo ');
  Query OK, 2 rows affected (0.00 sec)
  Records: 2  Duplicates: 0  Warnings: 0

  > select id, length(s) from t;
  +----+-----------+
  | id | length(s) |
  +----+-----------+
  |  1 |         3 |
  |  2 |         4 |
  +----+-----------+
  2 rows in set (0.00 sec)

  > select count(*) from t where s = 'foo';
  +----------+
  | count(*) |
  +----------+
  |        2 |
  +----------+
  1 row in set (0.00 sec)

> (More concerningly, I seem to recall that if you tell MySQL to index
> a TEXT field, the portion it indexes is only the first ~255 bytes).

And even if you explicitly say how much you want to be indexed,
there's a limit (2 kilobytes ish?  I forget) on the total length of
keys for a row.  Since that applies to UNIQUE indexes in the same
way, that means you can't actually use arbitrarily-long strings as
unique values in MySQL (or not without heroic workarounds in your
application code, anyway).

-- 
Aaron Crane ** http://aaroncrane.co.uk/