Home > Archive > Slony1 PostgreSQL Replication > August 2005 > Replicating an existing large database









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Replicating an existing large database
Ujwal S. Setlur

2005-08-10, 7:25 am

Hello,

I would like to set up slony1 (1.1.0) to replicate an
existing database that has one large table (~30
million rows) and several other smaller tables.

I went throught the steps of setting up a fresh slave
and subscribing to the master. However, when I started
replication, access to tables (using select statemtns)
on the slave are locking up. I suspect it is due to
the large amount of data being transferred.

What is the best way to start replication of an
existing large database?

1. Currently all the tables are in one set. I can try
to move the large table into a different set, but I
don't want to lock the application out of inserting
new rows into the table on the master.

2. Can I take a snapshot of the master using pg_dump,
restore it on the slave and then start replication?

Thanks,

Ujwal
Brad Nicholson

2005-08-10, 7:25 am

Ujwal S. Setlur wrote:

>Hello,
>
>I would like to set up slony1 (1.1.0) to replicate an
>existing database that has one large table (~30
>million rows) and several other smaller tables.
>
>I went throught the steps of setting up a fresh slave
>and subscribing to the master. However, when I started
>replication, access to tables (using select statemtns)
>on the slave are locking up. I suspect it is due to
>the large amount of data being transferred.
>
>What is the best way to start replication of an
>existing large database?
>
>1. Currently all the tables are in one set. I can try
>to move the large table into a different set, but I
>don't want to lock the application out of inserting
>new rows into the table on the master.
>
>
>

The subscribe will sub all the tables from one set in one go. If the
other tables are small, there is likely to be nothing gained from
putting them into another set.

BTW, you could do this without locking data writes out of the master.
Use the slonik command 'set drop table' to drop the big table from your
existing set, then create a new set containing the big table.

>2. Can I take a snapshot of the master using pg_dump,
>restore it on the slave and then start replication?
>
>
>

No, this won't work. When you issue subscribe set, slonik will delete
all the entries from the tables in that set (on the subscriber) and
repopulate them.

Some things that might help ease the burdon of the subscription (do
before issuing the subscribe):
-truncate all the tables on the new subscriber, and vacuum them.
-dropping the indexes on the big table on the subscriber will speed up
the copy. You'll have to recreate them manually after the subscription
is done though. There is a patch to do this automatically being worked
on, but it's not quite ready for prime time.
-turning off fsync on the subscriber might help as well (I've never
tried, so I can't say). _Don't_ do this if you have any other databases
running on that backend that are important! This may leave you with an
unrecoverable mess if the backend crashes. Be sure to turn it back on
once the subscription is complete.

I would also strongly recommend locking access on the subscriber to
everyone execpt the user that is doing the subscription. An easy way to
do this is to setup dedicated slony user, and lock out all other users
via the pg_hba.conf.

I'd expect this table to take quite a while to subscribe, but the amount
of time is going to depend largely on the hardware you are running.

--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.
Ujwal S. Setlur

2005-08-10, 7:25 am



--- Brad Nicholson <bnichols-swQf4SbcV9C7WVzo/KQ3Mw@public.gmane.org> wrote:

> Ujwal S. Setlur wrote:
>=20
> an
> slave
> started
> statemtns)
> try
> The subscribe will sub all the tables from one set
> in one go. If the=20
> other tables are small, there is likely to be
> nothing gained from=20
> putting them into another set.


That is what I thought.

>=20
> BTW, you could do this without locking data writes
> out of the master. =20
> Use the slonik command 'set drop table' to drop the
> big table from your=20
> existing set, then create a new set containing the
> big table.
>=20


Can you elaborate on that a little? The big table is a
data table that gets written to quite frequently. Can
I avoid locking data writes to that table on the
master during subscription?

Thanks,

Ujwal=20
Brad Nicholson

2005-08-10, 7:25 am

Ujwal S. Setlur wrote:

>--- Brad Nicholson <bnichols-swQf4SbcV9C7WVzo/KQ3Mw@public.gmane.org> wrote:
>
>
>
>
>That is what I thought.
>
>
>
>
>Can you elaborate on that a little? The big table is a
>data table that gets written to quite frequently. Can
>I avoid locking data writes to that table on the
>master during subscription?
>
>
>


The statement above was about how you could easily move the big table
into its own set, if you wanted to.

Subscribing a set will not lock the tables on the master (or the
subscriber - but you will not see the data on the subscriber until the
subscription is complete). You will place additional load on that db
while doing the subscribe, but at no point during a subscription will
writes or reads to/from the master database be blocked.


--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.
Christopher Browne

2005-08-10, 7:25 am

"Ujwal S. Setlur" <uvsetlur-/ E1597aS9LQAvxtiuMwx3
w@public.gmane.org> writes:
> I would like to set up slony1 (1.1.0) to replicate an
> existing database that has one large table (~30
> million rows) and several other smaller tables.
>
> I went throught the steps of setting up a fresh slave
> and subscribing to the master. However, when I started
> replication, access to tables (using select statemtns)
> on the slave are locking up. I suspect it is due to
> the large amount of data being transferred.


There is no surprise here.

The tables have exclusive locks on them on the subscriber while the
COPY_SET event takes place.

It is fair to say that the subscriber is not in a usable state while
the subscription is setting up.

> What is the best way to start replication of an
> existing large database?
>
> 1. Currently all the tables are in one set. I can try
> to move the large table into a different set, but I
> don't want to lock the application out of inserting
> new rows into the table on the master.


Nothing is locked on the master, except insofar as you have, during
the COPY_SET event, some AccessShareLock locks.

For instance, see the locks outstanding during a COPY_SET for one of
the ducttape tests:

/* cbbrowne@[local]/dba2 slony_test1=*/ select c.relname, l.transaction, l.pid, l.mode, l.granted from pg_locks l, pg_class c where c.oid = l.relation and relname in ('tellers', 'accounts', 'branches', 'history') order by c.relname, l.mode;
relname | transaction | pid | mode | granted
----------+-------------+-------+------------------+---------
accounts | | 26149 | AccessShareLock | t
accounts | | 26146 | AccessShareLock | t
accounts | | 26147 | AccessShareLock | t
accounts | | 26150 | AccessShareLock | t
accounts | | 26149 | RowExclusiveLock | t
accounts | | 26146 | RowExclusiveLock | t
accounts | | 26147 | RowExclusiveLock | t
accounts | | 26150 | RowExclusiveLock | t
branches | | 26149 | AccessShareLock | t
branches | | 26150 | AccessShareLock | t
branches | | 26147 | AccessShareLock | t
branches | | 26146 | AccessShareLock | t
branches | | 26149 | RowExclusiveLock | t
branches | | 26150 | RowExclusiveLock | t
branches | | 26147 | RowExclusiveLock | t
branches | | 26146 | RowExclusiveLock | t
tellers | | 26147 | AccessShareLock | t
tellers | | 26149 | AccessShareLock | t
tellers | | 26150 | AccessShareLock | t
tellers | | 26146 | AccessShareLock | t
tellers | | 26147 | RowExclusiveLock | t
tellers | | 26149 | RowExclusiveLock | t
tellers | | 26150 | RowExclusiveLock | t
tellers | | 26146 | RowExclusiveLock | t
(24 rows)

> 2. Can I take a snapshot of the master using pg_dump, restore it on
> the slave and then start replication?


No, that is not supported, and won't likely be in future.

At OSCON, there was discussion of a "CLONE NODE" notion, where raw
filesystem copies would be taken of nodes (likely copied via rsync)
one of two things would happen:

1. Duplicate a master node

In this case, the master node would have to be locked (all sets
having the slonik LOCK SET operation applied to them) during the
"clone time."

Notion:

Repeatedly rsync the master DB to a new host, until the updates
take little time to process.

Then LOCK SET, shut the master DB down, and do a final rsync.

Start both DBs

Submit a new "CLONE NODE" command, probably of the form...

clone node (source node=1, new node=4, conninfo='dbname=myd
b host=server2 port=5432');

This command verifies that both nodes are locked, then...
1. Submits a STORE_NODE to add in new node #4
2. Submits a STORE_PATH to add a path from 1 to 4
3. Generates a set of sl_subscribe entries for all of the sets on 1, where 4 subscribes to 1
4. Renumbers things on node 4 so that it believes itself to be node 4
5. Unlocks the master node

Now start 2 slons...

Alas, this requires LOCK SET and some outage of the master DB, so it
definitely does NOT fit into the category of "does not interrupt
operations." :-(

2. Duplicate a subscriber node

This is analagous; hits a subscriber node. Things are quite a bit
simpler to handle, and we don't have to interrupt operations.

Notion is similar:

Repeatedly rsync subscriber to a new host until the updates take
little time to process.

Then stop the slon for the subscriber, call it node #2

Do a final rsync

Start both DBs

Submit a new "CLONE NODE" command, of the form

clone node (source node=2, new node=4, conninfo='dbname=myd
b host=server2 port=5432');

This command verifies that both nodes are in sync, then...
1. Submits a STORE_NODE to add in new node #4
2. Submits a STORE_PATH to add a path from 2 to 4
3. Generates a set of sl_subscribe entries for all of the sets on 2, where 4 subscribes to 2
4. Renumbers things on node 4 so that it believes itself to be node 4

Now start 2 slons...

Note: This is a brand new plan, and CLONE NODE has not yet had a
single line of code written for it.

If people have rocks to throw at this, it would be nice to hear about
before implementing code :-).
--
"cbbrowne","@","ca.afilias.info"
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)
Ujwal S. Setlur

2005-08-10, 7:25 am

Thanks for the reply. For some reason, I thought write
access to the master was locked during subscription.
Thanks for setting me straight. I can deal with the
slave not being very usable until subscription is
done.

Thanks again,

Ujwal

--- Christopher Browne <cbbrowne-swQf4SbcV9C7WVzo/KQ3Mw@public.gmane.org>
wrote:

> "Ujwal S. Setlur" <uvsetlur-/ E1597aS9LQAvxtiuMwx3
w@public.gmane.org> writes:
> an
> slave
> started
> statemtns)
> to
>=20
> There is no surprise here.
>=20
> The tables have exclusive locks on them on the
> subscriber while the
> COPY_SET event takes place.
>=20
> It is fair to say that the subscriber is not in a
> usable state while
> the subscription is setting up.
>=20
> try
> I
> inserting
>=20
> Nothing is locked on the master, except insofar as
> you have, during
> the COPY_SET event, some AccessShareLock locks.
>=20
> For instance, see the locks outstanding during a
> COPY_SET for one of
> the ducttape tests:
>=20
> /* cbbrowne@[local]/dba2 slony_test1=3D*/ select
> c.relname, l.transaction, l.pid, l.mode, l.granted
> from pg_locks l, pg_class c where c.oid =3D l.relation
> and relname in ('tellers', 'accounts', 'branches',
> 'history') order by c.relname, l.mode;
> relname | transaction | pid | mode |
> granted=20
>

----------+-------------+-------+------------------+---------
> accounts | | 26149 | AccessShareLock |
> t
> accounts | | 26146 | AccessShareLock |
> t
> accounts | | 26147 | AccessShareLock |
> t
> accounts | | 26150 | AccessShareLock |
> t
> accounts | | 26149 | RowExclusiveLock |
> t
> accounts | | 26146 | RowExclusiveLock |
> t
> accounts | | 26147 | RowExclusiveLock |
> t
> accounts | | 26150 | RowExclusiveLock |
> t
> branches | | 26149 | AccessShareLock |
> t
> branches | | 26150 | AccessShareLock |
> t
> branches | | 26147 | AccessShareLock |
> t
> branches | | 26146 | AccessShareLock |
> t
> branches | | 26149 | RowExclusiveLock |
> t
> branches | | 26150 | RowExclusiveLock |
> t
> branches | | 26147 | RowExclusiveLock |
> t
> branches | | 26146 | RowExclusiveLock |
> t
> tellers | | 26147 | AccessShareLock |
> t
> tellers | | 26149 | AccessShareLock |
> t
> tellers | | 26150 | AccessShareLock |
> t
> tellers | | 26146 | AccessShareLock |
> t
> tellers | | 26147 | RowExclusiveLock |
> t
> tellers | | 26149 | RowExclusiveLock |
> t
> tellers | | 26150 | RowExclusiveLock |
> t
> tellers | | 26146 | RowExclusiveLock |
> t
> (24 rows)
>=20
> pg_dump, restore it on
>=20
> No, that is not supported, and won't likely be in
> future.
>=20
> At OSCON, there was discussion of a "CLONE NODE"
> notion, where raw
> filesystem copies would be taken of nodes (likely
> copied via rsync)
> one of two things would happen:
>=20
> 1. Duplicate a master node
>=20
> In this case, the master node would have to be
> locked (all sets
> having the slonik LOCK SET operation applied to
> them) during the
> "clone time."
>=20
> Notion:
>=20
> Repeatedly rsync the master DB to a new host,
> until the updates
> take little time to process.
>=20
> Then LOCK SET, shut the master DB down, and do a
> final rsync.
>=20
> Start both DBs
>=20
> Submit a new "CLONE NODE" command, probably of
> the form...
>=20
> clone node (source node=3D1, new node=3D4,
> conninfo=3D'dbname=3
Dmydb host=3Dserver2 port=3D5432');
>=20
> This command verifies that both nodes are locked,
> then...
> 1. Submits a STORE_NODE to add in new node #4
> 2. Submits a STORE_PATH to add a path from 1 to
> 4
> 3. Generates a set of sl_subscribe entries for
> all of the sets on 1, where 4 subscribes to 1
> 4. Renumbers things on node 4 so that it
> believes itself to be node 4
> 5. Unlocks the master node
>=20
> Now start 2 slons...
>=20
> Alas, this requires LOCK SET and some outage of the
> master DB, so it
> definitely does NOT fit into the category of "does
> not interrupt
> operations." :-(
>=20
> 2. Duplicate a subscriber node
>=20
> This is analagous; hits a subscriber node. Things
> are quite a bit
> simpler to handle, and we don't have to interrupt
> operations.
>=20
> Notion is similar:
>=20
> Repeatedly rsync subscriber to a new host until
> the updates take
> little time to process.
>=20
> Then stop the slon for the subscriber, call it
> node #2
>=20
> Do a final rsync
>=20
> Start both DBs
>=20
> Submit a new "CLONE NODE" command, of the form
>=20
> clone node (source node=3D2, new node=3D4,
> conninfo=3D'dbname=3
Dmydb host=3Dserver2 port=3D5432');
>=20
> This command verifies that both nodes are in
> sync, then...
> 1. Submits a STORE_NODE to add in new node #4
> 2. Submits a STORE_PATH to add a path from 2 to
> 4
> 3. Generates a set of sl_subscribe entries for
> all of the sets on 2, where 4 subscribes to 2
> 4. Renumbers things on node 4 so that it
> believes itself to be node 4
>=20
> Now start 2 slons... =20
>=20
> Note: This is a brand new plan, and CLONE NODE has
> not yet had a
> single line of code written for it.
>=20
>=20

=3D=3D=3D message truncated =3D=3D=3D
Ujwal S. Setlur

2005-08-10, 7:25 am



--- Brad Nicholson <bnichols-swQf4SbcV9C7WVzo/KQ3Mw@public.gmane.org> wrote:

>
> The statement above was about how you could easily
> move the big table=20
> into its own set, if you wanted to.
>=20
> Subscribing a set will not lock the tables on the
> master (or the=20
> subscriber - but you will not see the data on the
> subscriber until the=20
> subscription is complete). You will place
> additional load on that db=20
> while doing the subscribe, but at no point during a
> subscription will=20
> writes or reads to/from the master database be
> blocked.
>=20
>=20


Thanks for clearing it up for me. In this case, it is
OK for the slave to take a while to catch up as long
as the master is usable. I just hope my app does not
have to fail over during the subscription :-)

Ujwal
Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com