Databasen - Revisited

Wed Oct 18 21:39:18 BST 2006

>> I think I'd be wary of working for somebody who insisted this was
>> actually better and considered this sort of hack a virtue.
> 
> Sorry, but when it comes down to performance, squeezing out every last
> drop is not a hack. Do you consider a Schwartzian transform a hack too?

If your performance bottleneck is mapping months to their names, then
I'd say that you've done a pretty damn good job optimizing.  Go spend
your time on something useful, like posting flames to the london.pm list :)

> 
>> Any RDBMS worth its salt will skip the index if it's clearly faster to
>> do a full table scan.
> 
> You prefer to force the optimiser to have to examine its choices than
> give it none?

The optimizer doesn't really have to make a choice here.  It loads the
page and says, "oh shit, there are no more pages.  The record must be in
here -- time to scan."  From an algorithmic standpoint, what else can
you do?
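
To make the single-page case concrete, here is a toy sketch in Python.
It is not how any particular RDBMS is implemented: PAGE and lookup are
made-up names, and the "page" is just an in-memory list standing in for
the one page that holds all twelve rows.

```python
# Toy model: the whole table fits in a single "page" (here, a list).
PAGE = [
    (1, "January"), (2, "February"), (3, "March"), (4, "April"),
    (5, "May"), (6, "June"), (7, "July"), (8, "August"),
    (9, "September"), (10, "October"), (11, "November"), (12, "December"),
]


def lookup(page, month_number):
    """Scan the page row by row; with no other pages, there is nothing to choose."""
    for key, name in page:
        if key == month_number:
            return name
    return None  # not on the page, so not in the table


print(lookup(PAGE, 3))
```

With at most twelve rows to look at, the scan is bounded by a small
constant no matter what plan is picked.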

As an aside, this is why I like the Berkeley database -- you can specify
what sort of data structure you want.  In this case, you can use recno
and get faster lookups (read file starting from key * record_size) than
you would with btree or hashes.  Hashes are O(1) also, but
multiplication is probably faster than your hash function.  (BDB lets
you write your own hash function, though, so this isn't necessarily true
if your hash function is shift-and-mod or something.)
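
The "key * record_size" idea can be sketched in a few lines of Python.
This is not Berkeley DB's API or file format, only an illustration of
fixed-length record lookup; RECORD_SIZE, write_records and read_record
are names invented for the example, and keys are zero-based for
simplicity.

```python
import os
import struct
import tempfile

RECORD_SIZE = 16  # hypothetical fixed record width, in bytes

MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def write_records(path, names):
    """Write each name as a NUL-padded record of exactly RECORD_SIZE bytes."""
    with open(path, "wb") as f:
        for name in names:
            f.write(struct.pack(f"{RECORD_SIZE}s", name.encode("ascii")))


def read_record(path, key):
    """Fetch record number `key` by seeking straight to key * RECORD_SIZE."""
    if key < 0:
        raise KeyError(key)
    with open(path, "rb") as f:
        f.seek(key * RECORD_SIZE)  # one multiplication, no hashing, no tree walk
        raw = f.read(RECORD_SIZE)
    if len(raw) < RECORD_SIZE:
        raise KeyError(key)  # seeked past the last record
    return raw.rstrip(b"\0").decode("ascii")


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "months.dat")
    write_records(demo_path, MONTHS)
    print(read_record(demo_path, 9))
```

The catch is that every record has to be the same width, so short
values get padded and long ones have to fit under the limit.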

> (and yes, this is another one of the reasons why I think ORMs are bad).

This is another one of the reasons why I think people who dislike ORMs
are bad.  Instead of whining about some imagined performance difference
in the generated SQL, just fix the generated SQL.  Then instead of
solving the problem for every SQL statement you write, you can just
expect things to work (as can every other user of the module).  You're
too smart to waste your time hand-crafting SQL to get a 1% performance gain.

> It was a very specific example to try and get at very specific knowledge
> (fitting a table into a page). If you can think of a better way to get
> at that knowledge please let me know.

"How does the database store data?"  "Tell me about database pages." et
cetera.

Admittedly, I wouldn't have gotten your question right, and I've written
some pretty low-level database software.  I won't say that your question
was outright irrelevant, but the possibility exists.

>> So all you've actually achieved is a less robust schema
>> which may, ironically, cause the optimiser to make poorer decisions.
> 
> Please state when this would happen unless it's just postulating.

May, he said.  The point is you're reducing readability for maybe a 1%
performance gain.  Here's the reality -- hardware is getting cheaper
every day and programmer time is getting more expensive.  If you need a
new server, you can get one for $800 (or less these days).  If you need
another programmer to maintain your overly-optimized spaghetti, that's
going to cost you tens of thousands of dollars* a year!  That's a lot of
servers you could buy instead.

(* Should that be "pounds" instead?  I apologize for being an ignorant
American :/)

Regards,
Jonathan Rockway

-- 
package JAPH;use Catalyst qw/-Debug/;($;=JAPH)->config(name => do {
$,.=reverse qw[Jonathan tsu rehton lre rekca Rockway][$_].[split //,
";$;"]->[$_].q; ;for 1..4;$,=~s;^.;;;$,});$;->setup;