BerkeleyDB locking subsystem

Mon Jul 31 18:19:53 BST 2006

Hi Dirk,

This was very helpful. Some comments below.

> On Mon, 2006-07-31 at 16:41 +0200, Thomas Busch wrote:
> > Hi London.pm,
> > 
> > I'm using the CPAN package BerkeleyDB (not DB_File) to share
> > hashes between processes. The Berkeley environment I use for
> > this purpose is initialized with the flags DB_CREATE, DB_INIT_CDB,
> > and DB_INIT_MPOOL. It all works fine, except when one of the processes
> > crashes during a read or write, the Berkeley Hash gets
> > locked forever.
> 
> Here you have the great conundrum of using Berkeley, of any vintage or
> CPAN package. It drives me around the twist. One of the things you can
> do to mitigate your problem is to hook all the signals you can to make
> sure that you close your BDB connection before aborting. 

That's exactly what I did, but Perl doesn't allow catching the kill -9
signal (SIGKILL), probably for good reason. My problem occurs when a user
abruptly ends a CGI script.
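
To make the limitation concrete, here is a minimal sketch of the
cleanup-on-signal approach. It is written in Python only to keep it short
and self-contained; in Perl the same idea would go through %SIG. The names
install_cleanup_handlers and close_db are hypothetical, with close_db
standing in for whatever closes the database handles and the environment.
It assumes a Unix-like system, and since SIGKILL can never be caught, a
kill -9 still bypasses it.

```python
import signal
import sys

# Signals a process is allowed to intercept (Unix-like systems assumed).
# SIGKILL (kill -9) is deliberately absent: it is never delivered to a handler.
CATCHABLE = (signal.SIGINT, signal.SIGTERM, signal.SIGHUP)


def install_cleanup_handlers(close_db, signals=CATCHABLE):
    """Run close_db() once, then exit, when any of `signals` arrives.

    close_db is a hypothetical callable standing in for whatever closes
    the database handles and environment.
    """
    state = {"closed": False}

    def handler(signum, frame):
        # Guard against running the cleanup twice if signals arrive together.
        if not state["closed"]:
            state["closed"] = True
            close_db()
        # Conventional exit status for termination by signal: 128 + signum.
        sys.exit(128 + signum)

    for sig in signals:
        signal.signal(sig, handler)
    return handler
```

This narrows the window in which a process can die while holding a lock,
but it does not close it.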

> It is by no means perfect, but it cuts down a large portion of the problem.
> It is alleged that they have finally done something about this in the
> very latest release of Berkeley - but I have got so bitter and twisted
> with them that I have not tried nor tested it.
> 
> > 
> > Is there any way to avoid locking altogether when using concurrent
> > access to a BerkeleyDB, so that a crash of a process wouldn't affect
> > the BerkeleyDB? If not, is there a way to let locks expire? And if
> > this doesn't work, what are my alternatives?
> 
> No. It uses a common memory pool and its own internal buffering system
> and it is this that causes all the corruptions and stuff. You can
> recover from a program failure, but only by disconnecting all the non
> failed processes and then running the recovery command. 

Could the following be of any help? Have you looked into
this previously?
http://www.sleepycat.com/docs/api_c/env_set_timeout.html

> Use SQLite 3.x (as a suggestion) in simple key and value mode (i.e. tables
> have one key field and one value field, and you just use very simple DBI
> SQL to access them).
> 
> So far, it seems very reliable and it isn't noticeably slower than
> Berkeley. 

Do you know if SQLite 3.x can handle gigabytes of data? The reason I'm
asking is that MySQL couldn't do the job. Also, BerkeleyDB allows
sharing FIFOs between processes, which you can obviously emulate
with an SQL-based table, but that is clearly not as quick.
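
For concreteness, here is a rough sketch of the layout being discussed: one
table with a key column and a value column, plus a second table standing in
for a FIFO. It is written in Python with the standard sqlite3 module only to
keep it short and self-contained; from Perl the same SQL would go through
DBI. The table names (kv, fifo) and the function names are placeholders
invented for this sketch, and it makes no claim about speed or about
behaviour at gigabyte scale.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a database with a key/value table and a queue table."""
    # isolation_level=None puts the connection in autocommit mode, so
    # transactions are started explicitly only where they are needed.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v BLOB)")
    # AUTOINCREMENT ids only ever grow, so id order is insertion order.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fifo "
        "(id INTEGER PRIMARY KEY AUTOINCREMENT, item BLOB)"
    )
    return conn


def put(conn, key, value):
    """Store value under key, overwriting any previous value."""
    conn.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))


def get(conn, key, default=None):
    """Return the value stored under key, or default if it is absent."""
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else default


def push(conn, item):
    """Append an item to the tail of the queue."""
    conn.execute("INSERT INTO fifo (item) VALUES (?)", (item,))


def pop(conn):
    """Remove and return the oldest item, or None if the queue is empty."""
    # BEGIN IMMEDIATE takes the write lock before reading, so two
    # processes cannot both read and then delete the same row.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, item FROM fifo ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("DELETE FROM fifo WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row[1] if row else None
```

Every pop is a full write transaction here, which is presumably where the
extra cost over a native queue would come from.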

Thomas.