Speed v Version
chris_jack at msn.com
Wed Jun 1 13:15:15 BST 2011
On 1 Jun 2011, at 09:46, Dirk Koopman wrote:
> I contemplating providing "encouragement" to a customer to upgrade from 5.8.7 to something more modern. One of the overriding issues is "speed". The customer is fixated with "speed".
> Unfortunately one of the major things the customer's clients do is "replicate" their ISAM data into databases, usually MS-SQL via DBI and DBD::ODBC. The ISAM data is always on Linux (and a few Unix) boxes. The replication is a batch process that must complete overnight. Currently it is a fairly close run thing.
I'm guessing you're mostly talking about Microsoft SQLServer here? I have more experience with Sybase - but as they started from a common code base, I'm guessing performance characteristics are similar. If you're doing a lot of bulk loading - then raw SQL and stored procs are likely to be slow. I have found even using the bcp module in perl is slow compared to using the bcp utility that ships with Sybase (and SQLServer): last time I mentioned marked it - it was about double the time (obviously relative to my hardware). Compared to raw inserts/stored procs, it was way way more.
If you're lucky, the loads are to single tables and are encapsulated in a module - in which case it might be straightforward to rewrite the module to fork off a process to run the bcp utility and feed it data via a pipe - or alternatively just write the data to a file and load it directly via bcp.
Oracle has sqlloader - and presumably other databases have similar bulk load utilities.
There may also be quick things you can tweak like network packet size in SQLServer/Sybase - depending on what I was doing this lead to from 1 to 23% improvement from 512 to 4096 bytes (I couldn't go higher at the time without getting the dba's to reconfigure the server). In Oracle, setting RowCacheSize to something like 10000 can improve performance far more dramatically than gives me any comfort about the quality of SQL*Net.
Can you give us a bit more of a clue about what the perl is doing. Is it just munging data into a format that can be readily loaded into a series of single table inserts?
More information about the london.pm