Postgres Vacuum Cleaners

Raf rafiq at dreamthought.com
Mon Nov 20 14:43:13 GMT 2006


Thanks Domic and Peters.

This has really made things clear.

I went onto the box in question and can not find either the alleged log
file from the pg_autovacuum config or any pg_autovacuum daemons runnings.
I assume that it would be fair to assume the assumption that one may
assume that pg_autovacuum, for whatever reason, is probably not running?

Since I assume that pg_autovacuum works on the assumption that it enforces
incremental minimal 'ordinary' vacuum's and statistics rebuilds, do you
think that it would it make sense to do a FULL/CLUSTER VACUUM before
enabling pg_autovacuum for the first time?

Thanks,

Raf



On Mon, 20 Nov 2006, Dominic Mitchell wrote:

> On Mon, Nov 20, 2006 at 01:28:01PM +0000, Raf wrote:
> > Gents and Ladies,
> >
> > I've become rusty with postgres and figured that someone might be able
> > to shed some light.
> >
> > I am trying to understand postgres' vacuum functionality.  I've
> > started working with a DB, which I'm told uses "auto_vacuum."
> >
> > Now, to me this says little about the vacuuming policy, however, I
> > have seen the there are some very large files under the postgres data
> > device.
> > It seems that the size of these files exceeds the typical size of a
> > database dump, which to me suggests that there is still a physical
> > history of stale tuples in the DB and that it may need a full vacuum.
> > Googling has led me to believe that auto_vacuum only rebuilds stats /
> > does a vacuum analyse.  Thus, I don't believe that this would free up
> > space on the device?
>
> It's possible to free up some space, but not necessarily likely.
>
> Backing up a step, note that PostgreSQL has two forms of vacuum:
> "ordinary" vacuum and "full" vacuum.
>
> The ordinary vacuum goes through and marks out-of-date or deleted tuples
> (rows) as free space.  If it finds a full page (8kb) of contiguous free
> space, then it's possible to shrink the file it's using, but this is
> relatively rare.
>
> The full vacuum locks the entire table, and effectively writes out a new
> file containing no space at all, removing deleted and out-of-date tuples
> along the way.  Once upon a time (Pg 7.1), this was the only kind of
> vacuum.
>
> So a full vacuum gets you a smaller space in use.  *But*, it's not
> necessarily the right thing to do.  This is because there's inevitably
> some amount of "churn" in your database as rows get deleted and updated.
> So keeping around a bit of empty space in the file turns out to help
> matters, because you don't have to go off and ask the O/S for more
> space.
>
> In recent versions of PostgreSQL (8.1 and later I think), there is a
> tool called auto_vacuum.  This is a little daemon which runs alongside
> PostgreSQL, monitoring the update / insert / delete stats.  When so many
> operations have occurred, it triggers an ordinary vacuum for you.  This
> is (in general) a good idea, as it means you don't end up with dodgy
> cron scripts trying to guess when is a good time to do a daily vacuum.
>
>   NB: If you're running 7.4 or 8.0, all is not lost.  The auto_vacuum
>   daemon is in the "contrib" package and has to be stopped/started
>   separately.
>
> If you're wondering when a full vacuum might be useful, I'd say that
> it's handy when you have deleted most of the rows in a table and do not
> plan on refilling that table.  For example, in one of our databases, we
> went from "keep history data forever" to "keep 100 days of history".
> Performing a one-off full vacuum shrunk the space used in that table
> considerably.
>
> > Now, I was hoping that someone might be able to give be a brief
> > synopsis of the different vacuum arguments and their functionality.
> > I've tried googl'ing, but I'm hoping that this will be more
> > meaningful.  Further, is it correct to assume that while vacuuming the
> > db would be taken off line, or is there a way to expose it in a
> > query-only mode?
>
> The PostgreSQL documentation is actually quite reasonable.
>
>   http://www.postgresql.org/docs/8.1/static/maintenance.html#ROUTINE-VACUUMING
>
> The "ordinary" vacuum will let queries (and updates) run at the same
> time.  Although the vacuum can still be heavy on the disk I/O, so you
> may need to override the auto_vacuum daemon and run it manually at a
> time of day when the database is less busy.
>
> > The goal of my mission is to free up space and speed up queries.
> > Suggestions?
>
> To start with, use the auto_vacuum daemon until it becomes a problem.
> When it comes to speeding up queries, there are a few parameters  in
> postgresql.conf that can make a big difference to the speed.  I
> recommend this page for tuning PostgreSQL.
>
>   http://varlena.com/varlena/GeneralBits/Tidbits/perf.html
>
> It's a little bit old now, but it still seems to cover the salient
> points.
>
> I'd also recommend browsing the "runtime configuration" section of the
> manual, so you know what's available for tweaking.
>
>   http://www.postgresql.org/docs/8.1/static/runtime-config.html
>
> Oh, and there's a section on disk space usage as well.
>
>   http://www.postgresql.org/docs/8.1/static/diskusage.html
>
> -Dom
>


More information about the london.pm mailing list