Home > Archive > PostgreSQL Discussion > May 2005 > pg_dump in a production environment









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author pg_dump in a production environment
Thomas F. O'Connell

2005-05-23, 8:24 pm

I have a web application backed by a PostgreSQL 7.4.6 database. It's
an application with a fairly standard login process verified against
the database.

I'd like to use pg_dump to grab a live backup and, based on the
documentation, this would seem to be a realistic possibility. When I
try, though, during business hours, when people are frequently
logging in and otherwise using the application, the application
becomes almost unusable (to the point where logins take on the order
of minutes).

According to the documentation, pg_dump shouldn't block other
operations on the database other than operations that operate with
exclusive locks. Ordinarily, I run pg_autovacuum on the box, so I
tried again after killing that, thinking that perhaps any substantial
vacuum activity might affect pg_dump. I tried again to no avail.

Excepting the rest of the application, the login process should be
completely read-only and shouldn't require any exclusive locks.

Connections don't really pile up excessively, and load on the machine
does not get in the red zone. Is there anything else I should be
noticing?

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005


Scott Marlowe

2005-05-23, 8:24 pm

On Mon, 2005-05-23 at 14:54, Thomas F. O'Connell wrote:
> I have a web application backed by a PostgreSQL 7.4.6 database. It's
> an application with a fairly standard login process verified against
> the database.
>
> I'd like to use pg_dump to grab a live backup and, based on the
> documentation, this would seem to be a realistic possibility. When I
> try, though, during business hours, when people are frequently logging
> in and otherwise using the application, the application becomes almost
> unusable (to the point where logins take on the order of minutes).
>
> According to the documentation, pg_dump shouldn't block other
> operations on the database other than operations that operate with
> exclusive locks. Ordinarily, I run pg_autovacuum on the box, so I
> tried again after killing that, thinking that perhaps any substantial
> vacuum activity might affect pg_dump. I tried again to no avail.
>
> Excepting the rest of the application, the login process should be
> completely read-only and shouldn't require any exclusive locks.
>
> Connections don't really pile up excessively, and load on the machine
> does not get in the red zone. Is there anything else I should be
> noticing?


Basically, it sounds like postgresql is doing a lot of very long
sequential scans to do this backup. HAve you done a vacuum full
lately? It could be that you've got a lot of table bloat that's making
the seq scans take so long.

You could be I/O saturated already, and the backup is just pushing you
over the edge of the performance knee.

I do a 'vacuum analyze verbose' and see if you need more fsm setup for
your regular vacuums to keep up.

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Matthew T. O'Connor

2005-05-23, 8:24 pm

Thomas F. O'Connell wrote:

> I have a web application backed by a PostgreSQL 7.4.6 database. It's
> an application with a fairly standard login process verified against
> the database.
>
> I'd like to use pg_dump to grab a live backup and, based on the
> documentation, this would seem to be a realistic possibility. When I
> try, though, during business hours, when people are frequently logging
> in and otherwise using the application, the application becomes almost
> unusable (to the point where logins take on the order of minutes).



Could this be an I/O saturation issue like the one the vacuum delay
settings are supposed to help with? Perhaps we could either extend the
vacuum delay settings to effect pg_dump, or make new option to pg_dump
that would have it slow down the dump.

BTW, have you tried running pg_dump from a separate machine? Or even
just making sure that the dump file is being written to a different disk
drive than PostgreSQL is running on. All that disk write activity is
bound to slow the system down.

Matthew


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql
.org

Martijn van Oosterhout

2005-05-23, 8:24 pm

What's you pg_dump command? Some options may take a lot of memory.

If you list the processes while this is going on, do you see one
chewing all your memory? i.e what's really causing the problem...

Hope this helps,

On Mon, May 23, 2005 at 02:54:46PM -0500, Thomas F. O'Connell wrote:
> I have a web application backed by a PostgreSQL 7.4.6 database. It's
> an application with a fairly standard login process verified against
> the database.
>
> I'd like to use pg_dump to grab a live backup and, based on the
> documentation, this would seem to be a realistic possibility. When I
> try, though, during business hours, when people are frequently
> logging in and otherwise using the application, the application
> becomes almost unusable (to the point where logins take on the order
> of minutes).
>
> According to the documentation, pg_dump shouldn't block other
> operations on the database other than operations that operate with
> exclusive locks. Ordinarily, I run pg_autovacuum on the box, so I
> tried again after killing that, thinking that perhaps any substantial
> vacuum activity might affect pg_dump. I tried again to no avail.
>
> Excepting the rest of the application, the login process should be
> completely read-only and shouldn't require any exclusive locks.
>
> Connections don't really pile up excessively, and load on the machine
> does not get in the red zone. Is there anything else I should be
> noticing?

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


Thomas F. O'Connell

2005-05-23, 8:24 pm

Okay, I collated the three replies I got below for ease in replying.

I vacuum full analyze and reindexdb approximately once a month, but I
use pg_autovacuum as a matter of ongoing maintenance, and it seems to
hit equilibrium pretty well and seems to prevent bloat. The last time
I checked a vacuum analyze verbose, I had plenty of FSM to spare. The
data grows, but it doesn't seem to grow so quickly that I'd already
be out of FSM space.

I actually run pg_dump from a remote machine, so I/O contention on
the partition with $PGDATA shouldn't be an issue.

And here is the actual command:

pg_dump -h <host> -F c <database> > <dumpfile>

Pretty basic, although it is compressing.

As far as I can tell, the postmaster handling the dump request takes
up quite a bit of CPU, but not itself to the point where the database
should be unusable under ordinary circumstances. E.g., when a query/
backend eats up that much CPU, it doesn't prevent further access.

I'm suspicious more of something involving locks than of CPU.

Oh, and one other small(ish) detail: the dumping client is using a
7.4.8 installation, whereas the server itself is 7.4.6.

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

> From: Scott Marlowe < smarlowe@g2switchwor
ks.com>
> Date: May 23, 2005 3:18:33 PM CDT
> To: "Thomas F. O'Connell" <tfo@sitening.com>
> Cc: PgSQL General <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] pg_dump in a production environment
>
> Basically, it sounds like postgresql is doing a lot of very long
> sequential scans to do this backup. HAve you done a vacuum full
> lately? It could be that you've got a lot of table bloat that's
> making
> the seq scans take so long.
>
> You could be I/O saturated already, and the backup is just pushing you
> over the edge of the performance knee.
>
> I do a 'vacuum analyze verbose' and see if you need more fsm setup
> for
> your regular vacuums to keep up.


> From: "Matthew T. O'Connor" <matthew@zeut.net>
> Date: May 23, 2005 3:18:18 PM CDT
> To: "Thomas F. O'Connell" <tfo@sitening.com>
> Cc: PgSQL General <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] pg_dump in a production environment
>
> Could this be an I/O saturation issue like the one the vacuum delay
> settings are supposed to help with? Perhaps we could either extend
> the vacuum delay settings to effect pg_dump, or make new option to
> pg_dump that would have it slow down the dump.
>
> BTW, have you tried running pg_dump from a separate machine? Or
> even just making sure that the dump file is being written to a
> different disk drive than PostgreSQL is running on. All that disk
> write activity is bound to slow the system down.
>
> Matthew


> From: Martijn van Oosterhout <kleptog@svana.org>
> Date: May 23, 2005 3:25:23 PM CDT
> To: "Thomas F. O'Connell" <tfo@sitening.com>
> Cc: PgSQL General <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] pg_dump in a production environment
> Reply-To: Martijn van Oosterhout <kleptog@svana.org>
>
>
> What's you pg_dump command? Some options may take a lot of memory.
>
> If you list the processes while this is going on, do you see one
> chewing all your memory? i.e what's really causing the problem...
>
> Hope this helps,


Thomas F. O'Connell

2005-05-23, 8:24 pm

A note about database design, though: there are thousands of tables
in this database, most of them inherited. I haven't looked at the
internals of pg_dump, but generally, how do the sequential scans
work? Why would these prevent the tables from being accessed by
queries that don't require exclusive locks?

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On May 23, 2005, at 3:18 PM, Scott Marlowe wrote:

> Basically, it sounds like postgresql is doing a lot of very long
> sequential scans to do this backup. HAve you done a vacuum full
> lately? It could be that you've got a lot of table bloat that's
> making
> the seq scans take so long.


---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Scott Marlowe

2005-05-23, 8:24 pm

The real problem is that with 7.4's buffering algorithm, the sequential
scans blow the other data out of the internal buffers of postgresql.
And, since a backup needs all the data in the tables, it's gonna seq
scan them anyway. the tables can still be accessed, just the access is
going to be slow because your other processes are fighting the backup
AND nothing in the buffer is likely to be useful to them, except the one
table currently being backed up.

On Mon, 2005-05-23 at 15:58, Thomas F. O'Connell wrote:[color=darkred
]
> A note about database design, though: there are thousands of tables
> in this database, most of them inherited. I haven't looked at the
> internals of pg_dump, but generally, how do the sequential scans
> work? Why would these prevent the tables from being accessed by
> queries that don't require exclusive locks?
>
> -tfo
>
> --
> Thomas F. O'Connell
> Co-Founder, Information Architect
> Sitening, LLC
>
> Strategic Open Source: Open Your iâ„¢
>
> http://www.sitening.com/
> 110 30th Avenue North, Suite 6
> Nashville, TN 37203-6320
> 615-260-0005
>
> On May 23, 2005, at 3:18 PM, Scott Marlowe wrote:
>

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere
" to majordomo@postgresql
.org)

Chris Kratz

2005-05-23, 8:24 pm

Hello Thomas,

We've had and have the exact same issue and have been unable to find a
satisfactory solution to the problem. Currently we "just live with it". We
do periodic backups with pg_dump on an hourly basis. During the dump, other
accesses to the db are incredibly slow making our web app feel somewhat
sluggish for 5 to 10 minutes while the db is dumped.

After a lot of research, it appears to be an i/o and memory contention issue.
Basically, the dump procedure has to pull in all data and tables into memory
to dump them which means any other requests have to pull the data they need
back off of disk (because they got paged out to make room for the dump data)
making them very slow. This is compounded by the fact that pg_dump usually
saturates your I/O throughput. Since postgres doesn't manage the file system
buffers (the os does), there appears to be no easy way to tell it to only use
x amount of memory for the dump leaving all the other memory available for
the running database. I have a hunch that the same thing happens with the
shared buffers, though I haven't proven that. This wasn't a problem for us
while the db fit into ram, but we've grown far past that point now.

The only solution we have ever found is simply to use something like a slony
slave and do dumps from the slave. The slave takes the performance hit and
your primary db keeps on running at full speed. Once the dump is done, then
the slave can "catch up" if it needs to. Unfortunately, I believe there are
issues currently with restoring off of a dump from a slave.

-Chris

On Monday 23 May 2005 04:56 pm, Thomas F. O'Connell wrote:[color=darkred
]
> Okay, I collated the three replies I got below for ease in replying.
>
> I vacuum full analyze and reindexdb approximately once a month, but I
> use pg_autovacuum as a matter of ongoing maintenance, and it seems to
> hit equilibrium pretty well and seems to prevent bloat. The last time
> I checked a vacuum analyze verbose, I had plenty of FSM to spare. The
> data grows, but it doesn't seem to grow so quickly that I'd already
> be out of FSM space.
>
> I actually run pg_dump from a remote machine, so I/O contention on
> the partition with $PGDATA shouldn't be an issue.
>
> And here is the actual command:
>
> pg_dump -h <host> -F c <database> > <dumpfile>
>
> Pretty basic, although it is compressing.
>
> As far as I can tell, the postmaster handling the dump request takes
> up quite a bit of CPU, but not itself to the point where the database
> should be unusable under ordinary circumstances. E.g., when a query/
> backend eats up that much CPU, it doesn't prevent further access.
>
> I'm suspicious more of something involving locks than of CPU.
>
> Oh, and one other small(ish) detail: the dumping client is using a
> 7.4.8 installation, whereas the server itself is 7.4.6.
>
> -tfo
>
> --
> Thomas F. O'Connell
> Co-Founder, Information Architect
> Sitening, LLC
>
> Strategic Open Source: Open Your iâ„¢
>
> http://www.sitening.com/
> 110 30th Avenue North, Suite 6
> Nashville, TN 37203-6320
> 615-260-0005
>

--
Chris Kratz

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Scott Marlowe

2005-05-23, 8:24 pm

On Mon, 2005-05-23 at 16:54, Chris Kratz wrote:
> Hello Thomas,
>
> We've had and have the exact same issue and have been unable to find a
> satisfactory solution to the problem. Currently we "just live with it". We
> do periodic backups with pg_dump on an hourly basis. During the dump, other
> accesses to the db are incredibly slow making our web app feel somewhat
> sluggish for 5 to 10 minutes while the db is dumped.
>
> After a lot of research, it appears to be an i/o and memory contention issue.
> Basically, the dump procedure has to pull in all data and tables into memory
> to dump them which means any other requests have to pull the data they need
> back off of disk (because they got paged out to make room for the dump data)
> making them very slow. This is compounded by the fact that pg_dump usually
> saturates your I/O throughput. Since postgres doesn't manage the file system
> buffers (the os does), there appears to be no easy way to tell it to only use
> x amount of memory for the dump leaving all the other memory available for
> the running database. I have a hunch that the same thing happens with the
> shared buffers, though I haven't proven that. This wasn't a problem for us
> while the db fit into ram, but we've grown far past that point now.


Are you folks running 8.0 with its improved caching algorithms? Just
wondering if that helps or not.

> The only solution we have ever found is simply to use something like a slony
> slave and do dumps from the slave. The slave takes the performance hit and
> your primary db keeps on running at full speed. Once the dump is done, then
> the slave can "catch up" if it needs to. Unfortunately, I believe there are
> issues currently with restoring off of a dump from a slave.


Actually, there's a special dump program somewhere in the slony source
tree, one of the perl scripts. That should fix the issues with the
backups. I ran into it a while back and have to start using the same
file.

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Tom Lane

2005-05-24, 3:23 am

"Thomas F. O'Connell" <tfo@sitening.com> writes:
> I'd like to use pg_dump to grab a live backup and, based on the
> documentation, this would seem to be a realistic possibility. When I
> try, though, during business hours, when people are frequently
> logging in and otherwise using the application, the application
> becomes almost unusable (to the point where logins take on the order
> of minutes).


The pg_dump sources contain some comments about throttling the rate
at which data is pulled from the server, with a statement that this
idea was discussed during July 2000 and eventually dropped. Perhaps
you can think of a better implementation.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Thomas F. O'Connell

2005-05-24, 3:23 am

Actually, I would find this to be an interesting project, but we're
on the verge of moving to 8.0 via Slony and will have a replicated
cluster, reducing the need for live dumps on the primary read/write
database.

It's too bad round tuits are so expensive!

I was trying to think of a way today in which pg_dump might be able
to use statistics in almost the opposite way of pg_autovacuum, such
that it steered clear of objects in heavy use, but I'm not familiar
enough with the source to know how this might work.

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On May 23, 2005, at 11:12 PM, Tom Lane wrote:

> "Thomas F. O'Connell" <tfo@sitening.com> writes:
>
>
> The pg_dump sources contain some comments about throttling the rate
> at which data is pulled from the server, with a statement that this
> idea was discussed during July 2000 and eventually dropped. Perhaps
> you can think of a better implementation.
>
> regards, tom lane
>



---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Chris Kratz

2005-05-24, 9:23 am

On Monday 23 May 2005 06:09 pm, Scott Marlowe wrote:
> On Mon, 2005-05-23 at 16:54, Chris Kratz wrote:
> Are you folks running 8.0 with its improved caching algorithms? Just
> wondering if that helps or not.


I should have noted that we are still using 7.4 on our production servers. We
are planning an upgrade to 8.x later this year, so it is very plausible that
the new caching algorithms would help immensely. We are planning on moving
to a backup off of a slony slave at that point as well, so the issue will
probably be moot for us though it would still be nice to be able to do a
backup off of a running production machine.

> Actually, there's a special dump program somewhere in the slony source
> tree, one of the perl scripts. That should fix the issues with the
> backups. I ran into it a while back and have to start using the same
> file.


Interesting, I didn't know that, thanks for the pointer. We will look into
it.

-Chris
--
Chris Kratz

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere
" to majordomo@postgresql
.org)

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com