Re: Migration study, step 1: bulk write performance
Csaba Nagy

2006-03-21, 7:28 am

> Did you try mounting ext3 with data=writeback by chance? People have
> found that makes a big difference in performance.


I'm not sure; there are other people here doing the OS stuff, and I'm
pretty much ignorant about what "data=writeback" could mean :-D

They knew, however, that for the data partitions no FS journaling is
needed, and for the WAL partition metadata journaling is enough, so I
guess they tuned ext3 for this.
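
One way to check which mode a partition is actually using is to look at its mount options. The following is only a minimal sketch in Python: the function name ext3_data_mode, the sample device names and the /pgdata mount point are all made up for illustration, and it assumes mount lines in the /proc/mounts layout (device, mount point, type, options). If no data= option is listed, the function returns None, which by itself does not say which mode is in effect.

```python
def ext3_data_mode(mounts_text, mount_point):
    """Return the data= journaling mode listed for mount_point, or None.

    mounts_text is expected in /proc/mounts layout:
    device, mount point, fs type, comma-separated options, ...
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                return opt[len("data="):]
    return None


# Hypothetical sample; on a real Linux box you would read /proc/mounts instead.
SAMPLE = (
    "/dev/sda1 / ext3 rw,data=ordered 0 0\n"
    "/dev/sdb1 /pgdata ext3 rw,noatime,data=writeback 0 0\n"
)

print(ext3_data_mode(SAMPLE, "/pgdata"))  # prints: writeback
```

On a live system the same function could be fed open("/proc/mounts").read() in place of SAMPLE.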

Cheers,
Csaba.



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Steinar H. Gunderson

2006-03-21, 7:28 am

On Tue, Mar 21, 2006 at 06:18:39AM -0600, Jim C. Nasby wrote:
> Basically, you need to know for certain that if PostgreSQL creates a
> file and then fsync's it that that file is safely on disk, and that the
> filesystem knows how to find it (ie: the metadata is also on disk in
> some fashion).


It seems to, quoting Tom from
http://archives.postgresql.org/pgsq...1/msg00184.php:

== snip ==
No, Mike is right: for WAL you shouldn't need any journaling. This is
because we zero out *and fsync* an entire WAL file before we ever
consider putting live WAL data in it. During live use of a WAL file,
its metadata is not changing. As long as the filesystem follows
the minimal rule of syncing metadata about a file when it fsyncs the
file, all the live WAL files should survive crashes OK.

We can afford to do this mainly because WAL files can normally be
recycled instead of created afresh, so the zero-out overhead doesn't
get paid during normal operation.

You do need metadata journaling for all non-WAL PG files, since we don't
fsync them every time we extend them; which means the filesystem could
lose track of which disk blocks belong to such a file, if it's not
journaled.
== snip ==
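
The pattern described in the quote (zero out and fsync a whole file first, then only overwrite it in place) can be sketched roughly as follows. This is an illustration in Python, not PostgreSQL's actual code: the function names preallocate_zeroed and overwrite_in_place, the file name and the sizes are all invented for the example.

```python
import os
import tempfile


def preallocate_zeroed(path, size, chunk=8192):
    """Create path, fill it with `size` zero bytes, and fsync it.

    After this returns, the file's length and block allocation are
    settled, so later in-place writes do not need to extend the file.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        zeros = bytes(chunk)
        remaining = size
        while remaining > 0:
            written = os.write(fd, zeros[:min(chunk, remaining)])
            remaining -= written
        os.fsync(fd)
    finally:
        os.close(fd)


def overwrite_in_place(path, offset, payload):
    """Write payload at offset inside an already-allocated file, then fsync."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        view = memoryview(payload)
        while view:
            view = view[os.write(fd, view):]
        os.fsync(fd)
    finally:
        os.close(fd)


# Small demo in a throwaway directory (hypothetical file name and size).
with tempfile.TemporaryDirectory() as demo_dir:
    segment = os.path.join(demo_dir, "segment.demo")
    preallocate_zeroed(segment, 64 * 1024)
    overwrite_in_place(segment, 0, b"record-1")
    print(os.path.getsize(segment))  # prints: 65536
```

The size printed is unchanged by the second step, which is the point: the live writes land in blocks that were already allocated and synced.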

/* Steinar */
--
Homepage: http://www.sesse.net/


Jim C. Nasby

2006-03-21, 7:28 am

On Tue, Mar 21, 2006 at 01:29:54PM +0100, Steinar H. Gunderson wrote:
> On Tue, Mar 21, 2006 at 06:18:39AM -0600, Jim C. Nasby wrote:
>
> It seems to, quoting Tom from
> http://archives.postgresql.org/pgsq...1/msg00184.php:


404 :(

>
> == snip ==
> its metadata is not changing. As long as the filesystem follows
> the minimal rule of syncing metadata about a file when it fsyncs the
> file, all the live WAL files should survive crashes OK.


And therein lies the rub: file metadata *must* commit to disk as part of
an fsync, and it's needed for both WAL and heap data. It's needed for
heap data because as soon as a checkpoint completes, PostgreSQL is free
to erase any WAL info about previous DDL changes.
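
From the application side, the most a program can do is ask for both the file and its directory entry to be flushed. A minimal sketch in Python, assuming a POSIX system such as Linux where a directory can be opened read-only and passed to fsync; the name create_durably and the demo file name are made up, and whether the metadata really reaches the platter still depends on the filesystem and the drive:

```python
import os
import tempfile


def create_durably(path, payload):
    """Create path with payload, fsync the file, then fsync its directory.

    The second fsync asks the OS to flush the directory entry, so that
    the filesystem can still find the new file after a crash.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        view = memoryview(payload)
        while view:
            view = view[os.write(fd, view):]
        os.fsync(fd)
    finally:
        os.close(fd)

    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


# Demo in a throwaway directory (hypothetical file name).
with tempfile.TemporaryDirectory() as demo_dir:
    target = os.path.join(demo_dir, "relation.demo")
    create_durably(target, b"page data")
    print(os.path.getsize(target))  # prints: 9
```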

On FreeBSD, if you're using softupdates, the filesystem will properly
order writes to the drive so that metadata must be written before file
data; this ensures that an fsync on the file will first write any
metadata before writing the data itself.

With fsync turned off, any metadata-changing commands will wait for the
metadata to commit to disk before returning (unless you run async...)

I'm not really sure how this all plays out on a journaling filesystem.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

