Databasen - Revisited

Matt Sergeant msergeant at messagelabs.com
Wed Oct 18 23:03:00 BST 2006


On 18-Oct-06, at 4:39 PM, Jonathan Rockway wrote:

>>> Any RDBMS worth its salt will skip the index if it's clearly  
>>> faster to
>>> do a
>>> full table scan.
>>
>> You prefer to force the optimiser to have to examine its choices than
>> give it none?
>
> The optimizer doesn't really have to make a choice here.  It loads the
> page and says, "oh shit, there are no more pages.  the record must  
> be in
> here -- time to scan."  From an algorithmic standpoint, what else can
> you do?

Why would it load the page the table is on first? It looks at the  
schema first.

>> (and yes, this is another one of the reasons why I think ORMs are  
>> bad).
>
> This is another one of the reasons why I think people who dislike ORMs
> are bad.  Instead of whining about some imagined performance  
> difference
> in the generated SQL, just fix the generated SQL.  Then instead of
> solving the problem for every SQL statment you write, you can just
> expect things to work (as can every other user of the module).  You're
> too smart to waste your time hand-crafting SQL to get a 1%  
> performance gain.

Or I'm smart enough to know that I don't want an ORM writing the SQL  
for me, and that DBI is already enough of an abstraction layer that I  
don't need another one.

>> It was a very specific example to try and get at very specific  
>> knowledge
>> (fitting a table into a page). If you can think of a better way to  
>> get
>> at that knowledge please let me know.
>
> "How does the database store data?"  "Tell me about database  
> pages." et
> cetera.

Probably a better way to ask the question. This is why I'm not in  
management :-)

>>> So all you've actually achieved is a less robust schema
>>> which may, ironically, cause the optimiser to make poorer decisions.
>>
>> Please state when this would happen unless it's just postulating.
>
> May, he said.  The point is you're reducing readability for maybe a 1%
> performance gain.

I'd change the table to drop the words "PRIMARY KEY". That's not a  
significant reduction in readability IMHO.

> Here's the reality -- hardware is getting cheaper
> every day and programmer time is getting more expensive.  If you  
> need a
> new server, you can get one for $800 (or less these days).  If you  
> need
> another programmer to maintain your overly-optimized spaghetti, that's
> going to cost you tens of thousands of dollars* a year!  That's a  
> lot of
> servers you could buy instead.

Here's another reality - despite advances in database replication  
there is still often a single point of access in most database  
applications. Optimising your database is critical for high  
performance (ask anyone who has built a large scale DB backed  
application). Even in a replicated setup, databases don't scale  
brilliantly across servers (too much replication overhead,  
application becomes too complex etc).

Matt.

______________________________________________________________________
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email 
______________________________________________________________________


More information about the london.pm mailing list