Home > Archive > Slony1 PostgreSQL Replication > August 2005 > Slony >> Multi Master Replication???









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Slony >> Multi Master Replication???
Andy

2005-07-21, 7:24 am

I know that with Slony I is not possible to make a multi master replication.

Probably some of you have more experience with Postgre Replication and I really need to make a multi master replication. I heard that slony II will be able to do this... but I don't know iof there is any release date or somthing more specific about it.

Or... other replication solution?

Thaks again.
Andy.
Scott Marlowe

2005-07-21, 11:30 am

On Thu, 2005-07-21 at 05:05, Andy wrote:
> I know that with Slony I is not possible to make a multi master
> replication.
>
> Probably some of you have more experience with Postgre Replication and
> I really need to make a multi master replication. I heard that slony
> II will be able to do this... but I don't know iof there is any
> release date or somthing more specific about it.
>
> Or... other replication solution?
>


Well, there's always some kludgy answer to such problems. But, are you
sure you really need true multi-master replication? Mayhaps you could
describe what it is you are trying to do first, then allow the group to
help you come up with an answer. There may be an inobvious fix no one
will think of until they know what objectives you're trying to reach.
Andy

2005-07-22, 3:27 am

> Well, there's always some kludgy answer to such problems. But, are you
> sure you really need true multi-master replication? Mayhaps you could
> describe what it is you are trying to do first, then allow the group to
> help you come up with an answer.

Ok. Here is my story:

We develop a contract management software, we use a Postgre database, which
has around 100 tables. Our compnay wants so open 2 more offices(in
September) in Europe where the employees will have to work with this
management software. And we are looking for a solution that we can install
this software on local servers in this offices.

In this offices will be performed some major tasks like: add a contract,
planifications, report writings etc. We already use Slony for a datatransfer
to a client, but he can only read the data's and it is ok so. But in this
offices will be a lot of writing in the database. That is why I wanted a
multi master replications. Most of the tables are "static" tables. I mean
that they contain lists... that somehow are "fix"... can be user as
"read_only" in these offices. But a task like: add a contract writes data in
at least 8 tables, planning in 3, report writing in 6 or 7 didn't count them
exactly.

So, my problem is this: how to replicate the database and also keep these
tables "writable"??? The solution that I see is to duplicate these tables
and replicate them reversily(from slave to master) and to read the data with
scripts... but somehow I don't like it very much. The other problem with
this solution is that we'll have 2 offices now... what about the future...

I will appriciate any ideeas, solutions etc.

Best regards,
Andy.


----- Original Message -----
From: "Scott Marlowe" <smarlowe- yvqjf0D4ItFqO0yjgLDq
5wC/G2K4zDHf@public.gmane.org>
To: "Andy" <frum- JGHWHXdXz1VeoWH0uzbU
5w@public.gmane.org>
Cc: <slony1-general- AuKwsB3Fm+ugFIWk8tvy
RWD2FQJk+8+b@public.gmane.org>
Sent: Thursday, July 21, 2005 7:04 PM
Subject: Re: [Slony1-general] Slony >> Multi Master Replication???


> On Thu, 2005-07-21 at 05:05, Andy wrote:
>
> Well, there's always some kludgy answer to such problems. But, are you
> sure you really need true multi-master replication? Mayhaps you could
> describe what it is you are trying to do first, then allow the group to
> help you come up with an answer. There may be an inobvious fix no one
> will think of until they know what objectives you're trying to reach.
>
>

Andy

2005-07-25, 11:25 am

Well, everybody's in vacation (I hope so) or there is no solution for me???

Best regards,
Andy.


----- Original Message -----
From: "Andy" <frum- JGHWHXdXz1VeoWH0uzbU
5w@public.gmane.org>
To: "Scott Marlowe" <smarlowe- yvqjf0D4ItFqO0yjgLDq
5wC/G2K4zDHf@public.gmane.org>
Cc: <slony1-general- AuKwsB3Fm+ugFIWk8tvy
RWD2FQJk+8+b@public.gmane.org>
Sent: Friday, July 22, 2005 9:46 AM
Subject: Re: [Slony1-general] Slony >> Multi Master Replication???


> Ok. Here is my story:
>
> We develop a contract management software, we use a Postgre database,
> which has around 100 tables. Our compnay wants so open 2 more offices(in
> September) in Europe where the employees will have to work with this
> management software. And we are looking for a solution that we can install
> this software on local servers in this offices.
>
> In this offices will be performed some major tasks like: add a contract,
> planifications, report writings etc. We already use Slony for a
> datatransfer to a client, but he can only read the data's and it is ok so.
> But in this offices will be a lot of writing in the database. That is why
> I wanted a multi master replications. Most of the tables are "static"
> tables. I mean that they contain lists... that somehow are "fix"... can be
> user as "read_only" in these offices. But a task like: add a contract
> writes data in at least 8 tables, planning in 3, report writing in 6 or 7
> didn't count them exactly.
>
> So, my problem is this: how to replicate the database and also keep these
> tables "writable"??? The solution that I see is to duplicate these tables
> and replicate them reversily(from slave to master) and to read the data
> with scripts... but somehow I don't like it very much. The other problem
> with this solution is that we'll have 2 offices now... what about the
> future...
>
> I will appriciate any ideeas, solutions etc.
>
> Best regards,
> Andy.
>
>
> ----- Original Message -----
> From: "Scott Marlowe" <smarlowe- yvqjf0D4ItFqO0yjgLDq
5wC/G2K4zDHf@public.gmane.org>
> To: "Andy" <frum- JGHWHXdXz1VeoWH0uzbU
5w@public.gmane.org>
> Cc: <slony1-general- AuKwsB3Fm+ugFIWk8tvy
RWD2FQJk+8+b@public.gmane.org>
> Sent: Thursday, July 21, 2005 7:04 PM
> Subject: Re: [Slony1-general] Slony >> Multi Master Replication???
>
>
>
> ____________________
____________________
_______
> Slony1-general mailing list
> Slony1-general- AuKwsB3Fm+ugFIWk8tvy
RWD2FQJk+8+b@public.gmane.org
> http://gborg.postgresql.org/mailman.../slony1-general
>
>

Christopher Kings-Lynne

2005-07-25, 11:25 am

There is no multi master solution for PostgreSQL - at all AFAIK. What
you want is impossible unless you wait no small amount of time for Slony II.

Chris


Andy wrote:
> Well, everybody's in vacation (I hope so) or there is no solution for me???
>
> Best regards,
> Andy.
>
>
> ----- Original Message ----- From: "Andy" <frum- JGHWHXdXz1VeoWH0uzbU
5w@public.gmane.org>
> To: "Scott Marlowe" <smarlowe- yvqjf0D4ItFqO0yjgLDq
5wC/G2K4zDHf@public.gmane.org>
> Cc: <slony1-general- AuKwsB3Fm+ugFIWk8tvy
RWD2FQJk+8+b@public.gmane.org>
> Sent: Friday, July 22, 2005 9:46 AM
> Subject: Re: [Slony1-general] Slony >> Multi Master Replication???
>
>
>
> ____________________
____________________
_______
> Slony1-general mailing list
> Slony1-general- AuKwsB3Fm+ugFIWk8tvy
RWD2FQJk+8+b@public.gmane.org
> http://gborg.postgresql.org/mailman.../slony1-general

Christopher Browne

2005-07-25, 11:25 am

Andy wrote:

> Well, everybody's in vacation (I hope so) or there is no solution for
> me???


The answer is a bit of both.

I've been away, and will be away next week; no doubt others have been on
the run.

I'm not certain that there will be an answer that is entirely suitable
for you...

>From the description you give, it sounds as though you would like to

have the replicas operate somewhat independently, that is, for each site
to use the "local" DB as 'master,' and (maybe sporadically) have the
databases connect to one another to feed one another updates.

- Slony-I was NOT intended as a multimaster replication system, and
there are no plans to take it there.

- Slony-II will be a multimaster replication system, but I'm not sure it
will be particularly suited to your purpose. The operating model
requires that the nodes be pretty tightly coupled; they will not run
well disconnected. Disconnection from the cluster requires that a node
fall into a "read only" mode until it can renegotiate connection(s) and
get resynchronized.

I could imagine someone taking some release of Slony-I and hacking on it
(rather heavily) to create a multimaster system; definitely nontrivial,
and definitely not something I'm starting on today...
Josh Berkus

2005-07-25, 11:25 am

Andy,
[color=darkred]

What it sounds like is that you want asynchronous multi-master replication by
partition ... the SyBase model. Bascially, you need to set things up so
that each office is the "master" of its own data and a "slave" for other
offices' data.

You could do this using table partitioning (inheritance) and Slony-I.
Frankly, though, I'd find a custom code solution to be less kludgy than that:
a set of PL/perl,python, or Java triggers which would track which rows the
local office was master of and prevent updates to other rows.

Regardless, this isn't just something you can retrofit onto an existing
application and expect it to work smoothly. You'll need to do a lot of
back-end redesign around this requirement. That's equally true in Sybase or
MSSQL or Oracle, it's not just Postgres.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Jan Wieck

2005-07-26, 9:24 am

On 7/25/2005 11:21 AM, Christopher Kings-Lynne wrote:

> There is no multi master solution for PostgreSQL - at all AFAIK. What
> you want is impossible unless you wait no small amount of time for Slony II.


I am not even sure that Slony-II will be a solution for Andy. It is
planned as a synchronous multimaster system and that will naturally suck
big times in a WAN setup with network roundtrip times hundreds of times
what you can expect in a gigabit or better LAN.

What would be required here is some asynchronous multimaster with
conflict resolution. I am not aware of any existing solution like that.


Jan

>
> Chris
>
>
> Andy wrote:
> ____________________
____________________
_______
> Slony1-general mailing list
> Slony1-general- AuKwsB3Fm+ugFIWk8tvy
RWD2FQJk+8+b@public.gmane.org
> http://gborg.postgresql.org/mailman.../slony1-general



--
#===================
====================
====================
===========#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#===================
====================
=========== JanWieck- bwPqjjyvM7QAvxtiuMwx
3w@public.gmane.org #
Andy

2005-07-27, 3:26 am

Thank you all for replying.

At least I have a better image about my "problem"... it really becomes a big
one :(.

I looked also over other replication systems, but the problems still remains
open.

Best regards,
Andy.



----- Original Message -----
From: "Jan Wieck" <JanWieck- bwPqjjyvM7QAvxtiuMwx
3w@public.gmane.org>
To: "Christopher Kings-Lynne" <chriskl- GboeX+OVueiPm+gJULTN
R7pzq4S04n8Q@public.gmane.org>
Cc: "Andy" <frum- JGHWHXdXz1VeoWH0uzbU
5w@public.gmane.org>; <slony1-general- AuKwsB3Fm+ugFIWk8tvy
RWD2FQJk+8+b@public.gmane.org>
Sent: Tuesday, July 26, 2005 4:31 PM
Subject: Re: [Slony1-general] Slony >> Multi Master Replication???


> On 7/25/2005 11:21 AM, Christopher Kings-Lynne wrote:
>
>
> I am not even sure that Slony-II will be a solution for Andy. It is
> planned as a synchronous multimaster system and that will naturally suck
> big times in a WAN setup with network roundtrip times hundreds of times
> what you can expect in a gigabit or better LAN.
>
> What would be required here is some asynchronous multimaster with conflict
> resolution. I am not aware of any existing solution like that.
>
>
> Jan
>
>
>
> --
> #===================
====================
====================
===========#
> # It's easier to get forgiveness for being wrong than for being right. #
> # Let's break this rule - forgive me. #
> #===================
====================
=========== JanWieck- bwPqjjyvM7QAvxtiuMwx
3w@public.gmane.org #
>
>

Christopher Browne

2005-07-27, 11:38 am

"Andy" <frum- JGHWHXdXz1VeoWH0uzbU
5w@public.gmane.org> writes:

> Thank you all for replying.
>
> At least I have a better image about my "problem"... it really becomes
> a big one :(.
>
> I looked also over other replication systems, but the problems still
> remains open.


Yes, this indicates that there is room for someone to implement an
asynchronous multimaster replication system with conflict resolution.

There may well be similarities; some of the components of Slony-I
might well be useful in the implementation of such a system, but
that's the level at which things might be "supported."
--
"cbbrowne","@","ca.afilias.info"
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)
Andreas Pflug

2005-07-27, 11:38 am

Christopher Browne wrote:

>"Andy" <frum- JGHWHXdXz1VeoWH0uzbU
5w@public.gmane.org> writes:
>
>
>
>
>Yes, this indicates that there is room for someone to implement an
>asynchronous multimaster replication system with conflict resolution.
>
>

Better than conflict resolution is conflict avoidance. This is dependent
on the application, and might work if separate number/id spaces for all
objects may be defined. Situations as described will always need adapted
apps and organizational prerequisites. Multimaster replication with
conflict resolution is much of an illusion (... for any database system.
Difference to pgsql is other rdbms' marketing people can lie better :-)

Regards,
Andreas
Christopher Browne

2005-07-27, 11:38 am

Andreas Pflug wrote:

> Christopher Browne wrote:
>
> Better than conflict resolution is conflict avoidance. This is
> dependent on the application, and might work if separate number/id
> spaces for all objects may be defined. Situations as described will
> always need adapted apps and organizational prerequisites. Multimaster
> replication with conflict resolution is much of an illusion (... for
> any database system. Difference to pgsql is other rdbms' marketing
> people can lie better :-)


Well, yes, you certainly want to avoid conflicts altogether if you can.

Implementing some form of "distributed sequence" where each node uses
separate ranges is a reasonable solution to those conflicts that would
arise as a result of "internal interlinkages".

That way, you might, instead of using the default "nextval('my_seq')" to
fill in 'linkage' fields, use the default " dist_nextval('my_dis
t_seq')",
where you have some extra functionality hiding behind the scenes so that
the "distributed sequence" called my_dist_seq assigns different value
ranges on one host as compared to another.

Based on what I have heard, I think Slony-II could benefit from that;
the idea is hardly confined to being useful in one place.

However, there will be conflicts that you cannot avoid.

For instance, there is the conflict that someone on node #1 draws down
inventory of product ABC by 20 units, whilst, concurrently, someone on
node #2 draws down inventory on the same product by 30 units, when the
inventory started at 40 units, and there is a rule that inventory cannot
fall below zero. Separate "sequence spaces" does nothing to help with that.

Essentially equivalent is the conflict that someone on node #1 updates
the phone number of customer XYZ whilst concurrently someone else on
node 2 does a different update to the phone number for the same customer.

Those are two examples of application-driven conflicts.

Some clever modifications to sequence handling may allow you to avoid
having the replication system itself introduce conflicts, which is well
and good; it does nothing to resolve the sorts of "application-driven"
conflicts described above.

If I understand things correctly, one of Oracle's multimaster
replication systems will queue up updates and stop the queue when it
finds conflicts. The DBA then has to resolve the conflicts. (They may
have some way to automate parts of it; I'm not sure.)

SAP R/3 had an interesting way of handling this sort of thing; you could
group its "BDC" updates together and try to run a batch of transactions
through. Those transactions that failed would be left; you could try to
walk them through manually, step by step. That was pretty neat as it
would allow someone with application knowledge to process errors rather
than forcing it on the "techies" who might not be familiar with the
business transactions involved. But that's a lightyear away from being
something we could add to a replication system...
Christopher Kings-Lynne

2005-07-28, 3:24 am

> Implementing some form of "distributed sequence" where each node uses
> separate ranges is a reasonable solution to those conflicts that would
> arise as a result of "internal interlinkages".


All you do is make one db use even numbers and the other use odd by
setting start and step values of the sequence.

Chris
cbbrowne-swQf4SbcV9C7WVzo/KQ3Mw@public.gmane.org

2005-07-28, 3:24 am

>> Implementing some form of "distributed sequence" where each node uses
>
> All you do is make one db use even numbers and the other use odd by
> setting start and step values of the sequence.


How, in that scenario, does one introduce a fifth "master node"?

If you have one office that is an asynchronous node, that could easily be
extended to needing five offices. Or 37 remote salescritters, each with =
a
database on their laptop...

It's not too difficult to solve the "distributed sequence" for the case o=
f
two nodes, as even/odd does work out quite nicely. Generalizing is not s=
o
easy...
Magnus Hagander

2005-07-28, 3:24 am

> >> Implementing some form of "distributed sequence" where=20
> each node uses=20
>=20
> How, in that scenario, does one introduce a fifth "master node"?
>=20
> If you have one office that is an asynchronous node, that=20
> could easily be extended to needing five offices. Or 37=20
> remote salescritters, each with a database on their laptop...
>=20
> It's not too difficult to solve the "distributed sequence"=20
> for the case of two nodes, as even/odd does work out quite=20
> nicely. Generalizing is not so easy...


Make it a compound key that has one part being the node id and one part
being a local sequence counter?

//Magnus
Jan Wieck

2005-07-28, 9:25 am

On 7/28/2005 12:06 AM, cbbrowne-swQf4SbcV9C7WVzo/KQ3Mw@public.gmane.org wrote:
>
> How, in that scenario, does one introduce a fifth "master node"?


By having the sequence increment in 10 or 100 steps. This has even the
side effect that you can tell by the last 1 or 2 digits which node did
allocate the ID.

>
> If you have one office that is an asynchronous node, that could easily be
> extended to needing five offices. Or 37 remote salescritters, each with a
> database on their laptop...
>
> It's not too difficult to solve the "distributed sequence" for the case of
> two nodes, as even/odd does work out quite nicely. Generalizing is not so
> easy...


Well, with this method you have to pick some upper limit, which is then
cemented into the increment step.

Another method would be to modify sequences as follows. When a sequence
reaches maxval on a nextval() call, it looks up a sequence number range
table. If there is an unused "block" assigned to this sequence, the
sequences current and max values are adjusted to the block boundary and
the block marked used. WARNING: there is a problem here if the
transaction later aborts!!!

It will become the DBA's duty to monitor which nodes will run out of
their number range and put allocations into place.


Jan

--
#===================
====================
====================
===========#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#===================
====================
=========== JanWieck- bwPqjjyvM7QAvxtiuMwx
3w@public.gmane.org #
Jan Wieck

2005-07-28, 9:25 am

On 7/28/2005 3:45 AM, Magnus Hagander wrote:

>
> Make it a compound key that has one part being the node id and one part
> being a local sequence counter?


That allways works.


Jan

--
#===================
====================
====================
===========#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#===================
====================
=========== JanWieck- bwPqjjyvM7QAvxtiuMwx
3w@public.gmane.org #
Andrew Sullivan

2005-08-10, 11:26 am

On Wed, Jul 27, 2005 at 11:13:55AM -0400, Christopher Browne wrote:
>
> Yes, this indicates that there is room for someone to implement an
> asynchronous multimaster replication system with conflict resolution.


This was a long time coming as a followup, but Josh Berkus sketched
the kernel of a way to do this using Slony for me while we were at
OSCON. I dunno if it'll work, but basically the idea is to expand
the key that the trigger needs to include a serverid. You'd need
some management tools for this.

He told me that Sybase's answer to conflict resolution in these cases
is that the first one in wins, period. I don't know I'm comfortable
with that.

A

--
Andrew Sullivan | ajs-oaT0K0jot5/q2IAV+ODieA@public.gmane.org
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well.
--Dennis Ritchie
Jan Wieck

2005-08-10, 11:26 am

On 8/10/2005 10:39 AM, Andrew Sullivan wrote:
> On Wed, Jul 27, 2005 at 11:13:55AM -0400, Christopher Browne wrote:
>
> This was a long time coming as a followup, but Josh Berkus sketched
> the kernel of a way to do this using Slony for me while we were at
> OSCON. I dunno if it'll work, but basically the idea is to expand
> the key that the trigger needs to include a serverid. You'd need
> some management tools for this.
>
> He told me that Sybase's answer to conflict resolution in these cases
> is that the first one in wins, period. I don't know I'm comfortable
> with that.


First one wins? I thought last update wins is the default for that sort.

Both are insufficient for resolving conflicts on balance type columns
(like an account balance). What would be necessary is the distinction
between balance and overwrite type columns, plus a timestamp on all
replicated tables (either in the table itself or a helper status table).
If an update to a row happens, it can generate two replication
recordings. One for the overwrite type columns, that just propagates the
new values. The other for balance type columns, that propagates delta
values instead. All balance type recordings are applied on all peers no
matter what their timestamp, plus they don't affect the timestamping on
the rows. The overwrite recordings are only applied if their timestamp
is later than the timestamp on the row.

There are more complicated race conditions. Think of a 3 node setup.
Node A does an INSERT. Node B replicates that and then does an UPDATE to
that row. How do you guarantee that node C will replicate the INSERT
before replicating the UPDATE?


Jan

--
#===================
====================
====================
===========#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#===================
====================
=========== JanWieck- bwPqjjyvM7QAvxtiuMwx
3w@public.gmane.org #
Andrew Sullivan

2005-08-10, 8:25 pm

On Wed, Aug 10, 2005 at 11:02:32AM -0400, Jan Wieck wrote:
>
> First one wins? I thought last update wins is the default for that sort.


I'm just reporting what I heard.

> Both are insufficient for resolving conflicts on balance type columns
> (like an account balance). What would be necessary is the distinction


No guff ;-)

> There are more complicated race conditions. Think of a 3 node setup.
> Node A does an INSERT. Node B replicates that and then does an UPDATE to
> that row. How do you guarantee that node C will replicate the INSERT
> before replicating the UPDATE?


You got me. This is why I hate such systems -- they're too hard to
understand, and I always feel like something's going to get
undetectably lost.

A

--
Andrew Sullivan | ajs-oaT0K0jot5/q2IAV+ODieA@public.gmane.org
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.
--Brad Holland
Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com