Re: postgres in HA constellation
| Andrew Sullivan 2006-10-25, 8:24 am |
| On Thu, Oct 05, 2006 at 04:24:17AM -0000, Sebastian Reitenbach wrote:
>
> I just have one data center, no remote far away replication is needed.
If it is at all feasible with your budget, I'd think _very strongly_
about replicating using Slony inside your data centre _too_. The
shared storage answer is nice, but it is _really really really_ easy
to shoot yourself in the foot with a rocket propelled grenade with
that arrangement. Very careful administration might prevent it, but
there is a reason that none of the corporate people will guarantee
two machines will never accidentally mount the same file system at
once: in a shared-disc-only system, it's impossible to be 100%
certain that the other machine really is dead and not coming back.
Very tricky scripts could of course lower the risk.
If you're really going to have all that data, it's going to be a
major pain to restore in the event of such corruption. In addition,
your recovery will only be to the last dump. That's why I suggest
replicating, either with Slony or something else, as a belt that will
nicely complement the suspenders of your shared-disc failover.
A
--
Andrew Sullivan | ajs@crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what
you told them to. That actually seems sort of quaint now.
--J.D. Baldwin
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
| |
| Jim Nasby 2006-10-25, 8:24 am |
| On Oct 5, 2006, at 1:41 PM, Andrew Sullivan wrote:
> On Thu, Oct 05, 2006 at 04:24:17AM -0000, Sebastian Reitenbach wrote:
>
> If it is at all feasible with your budget, I'd think _very strongly_
> about replicating using Slony inside your data centre _too_. The
> shared storage answer is nice, but it is _really really really_ easy
> to shoot yourself in the foot with a rocket propelled grenade with
> that arrangement. Very careful administration might prevent it, but
> there is a reason that none of the corporate people will guarantee
> two machines will never accidentally mount the same file system at
> once: in a shared-disc-only system, it's impossible to be 100%
> certain that the other machine really is dead and not coming back.
> Very tricky scripts could of course lower the risk.
Isn't it entirely possible that if the master gets trashed it would
start sending garbage to the Slony slave as well?
I think PITR would be a much better option to protect against this,
since you could probably recover up to the exact point of failover.
When it comes to the actual failover, take a look at the HA-linux
project. They've got some stuff you could probably use (such as the
heartbeat program). Another really good idea is to give the backup
machine the ability to kill the power to the primary machine, and not
have either machine mount the shared storage at bootup.
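The ordering described above (cut the primary's power first, and only
then mount and start) can be sketched roughly as follows. This is an
illustration in Python, not Heartbeat's actual configuration or API:
`fail_over` and the four callables passed to it (`peer_alive`,
`fence_peer`, `mount_storage`, `start_postgres`) are hypothetical names
standing in for whatever your heartbeat check, power switch, and init
scripts really do.

```python
from typing import Callable


def fail_over(
    peer_alive: Callable[[], bool],
    fence_peer: Callable[[], bool],
    mount_storage: Callable[[], None],
    start_postgres: Callable[[], None],
) -> str:
    """Take over the shared disk only after the primary is fenced.

    All four callables are hypothetical hooks supplied by the caller.
    """
    if peer_alive():
        # Heartbeat still answers: the primary owns the disk, do nothing.
        return "peer-alive"
    # A missed heartbeat does not prove the primary stopped writing,
    # so power it off before touching the shared storage.
    if not fence_peer():
        # Power-off could not be confirmed: refuse to mount.
        return "fence-failed"
    # Storage is never mounted at boot, only here, after fencing.
    mount_storage()
    start_postgres()
    return "took-over"
```

The point of this shape is that a failed or unconfirmed power-off leaves
the shared disk untouched, which costs availability but avoids having
two machines write to the same file system.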
If you're interested in paying someone to help set this up, I
know that we (EnterpriseDB) have folks that have done this before. I
suspect that some of the other folks listed on the commercial support
page have done this as well (likely Command Prompt and Varlena).
--
Jim Nasby jimn@enterprisedb.com
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
| |
| Andrew Sullivan 2006-10-25, 8:24 am |
| On Thu, Oct 05, 2006 at 08:43:21PM -0500, Jim Nasby wrote:
> Isn't it entirely possible that if the master gets trashed it would
> start sending garbage to the Slony slave as well?
Well, maybe, but unlikely. What happens in a shared-disc failover is
that the second machine re-mounts the same partition as the old
machine had open. The risk is the case where your to-be-removed
machine hasn't actually stopped writing on the partition yet, but
your failover software thinks it's dead, and can fail over. Two
processes have the same Postgres data and WAL files mounted at the
same time, and blammo. As nearly as I can tell, it takes
approximately zero time for this arrangement to make such a mess that
you're not committing any transactions. Slony will only get the data
on COMMIT, so the risk is very small.
> I think PITR would be a much better option to protect against this,
> since you could probably recover up to the exact point of failover.
That oughta work too, except that your remounted WAL gets corrupted
under the imagined scenario, and then you copy the next updates to
the WAL. So you have to save all the incremental copies of the WAL
you make, so that you don't have a garbage file to read.
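One way to read "save all the incremental copies" is that the archive
step must never overwrite a segment it has already stored. A minimal
sketch in Python, assuming a hypothetical helper `archive_segment` that
an archive_command could call with the segment path (%p) and an archive
directory; the function name and directory layout are illustrative, not
anything PostgreSQL ships.

```python
import shutil
from pathlib import Path


def archive_segment(src: str, archive_dir: str) -> bool:
    """Copy one WAL segment into archive_dir without ever overwriting.

    Returns True if the segment was stored, False if a file with the
    same name was already archived (the existing copy is left alone).
    """
    source = Path(src)
    target = Path(archive_dir) / source.name
    try:
        # Mode "xb" creates the file exclusively and fails if it exists,
        # so a later, possibly corrupted copy cannot replace a good one.
        with open(source, "rb") as fin, open(target, "xb") as fout:
            shutil.copyfileobj(fin, fout)
    except FileExistsError:
        return False
    return True
```

A wrapper script would exit non-zero when this returns False, so that
the server keeps the segment and the refusal is visible in the logs
instead of silently losing the earlier copy.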
As I said, I don't think that it's a bad idea to use this sort of
trick. I just think it's a poor single line of defence, because when
it fails, it fails hard.
A
--
Andrew Sullivan | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.
--Brad Holland
| |
| Jim C. Nasby 2006-10-25, 8:24 am |
| On Wed, Oct 11, 2006 at 10:28:44AM -0400, Andrew Sullivan wrote:
> On Thu, Oct 05, 2006 at 08:43:21PM -0500, Jim Nasby wrote:
>
> Well, maybe, but unlikely. What happens in a shared-disc failover is
> that the second machine re-mounts the same partition as the old
> machine had open. The risk is the case where your to-be-removed
> machine hasn't actually stopped writing on the partition yet, but
> your failover software thinks it's dead, and can fail over. Two
> processes have the same Postgres data and WAL files mounted at the
> same time, and blammo. As nearly as I can tell, it takes
> approximately zero time for this arrangement to make such a mess that
> you're not committing any transactions. Slony will only get the data
> on COMMIT, so the risk is very small.
Hrm... I guess it depends on how quickly the Slony master would stop
processing if it was talking to a shared-disk that had become corrupt
from another postmaster.
>
> That oughta work too, except that your remounted WAL gets corrupted
> under the imagined scenario, and then you copy the next updates to
> the WAL. So you have to save all the incremental copies of the WAL
> you make, so that you don't have a garbage file to read.
>
> As I said, I don't think that it's a bad idea to use this sort of
> trick. I just think it's a poor single line of defence, because when
> it fails, it fails hard.
Yeah, STONITH is *critical* for shared-disk.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
| |
| Chris Browne 2006-10-25, 8:24 am |
| bnichols@ca.afilias.info (Brad Nicholson) writes:
> On Wed, 2006-10-11 at 16:12 -0500, Jim C. Nasby wrote:
>
> That doesn't depend on Slony, it depends on Postgres. If transactions
> are committing on the master, Slony will replicate them. You could have
> a situation where your HA failover trashes some of your database, but the
> database still starts up. It starts accepting and replicating
> transactions before the corruption is discovered.
There's a bit of "joint responsibility" there.
Let's suppose that the disk has gone bad, zeroing out some index pages
for the Slony-I table sl_log_1. (The situation will be the same for
just about any kind of corruption of a Slony-I internal table.)
There are two possibilities:
1. The PostgreSQL instance may notice that those pages are bad,
returning an error message, and halting the SYNC.
2. The PostgreSQL instance may NOT notice that those pages are bad,
and, as a result, fail to apply some updates, thereby corrupting
the subscriber.
I think there's a pretty high probability of 1) happening rather than
2), but there is a risk of corruption of subscribers roughly
proportional to the probability of 2).
My "gut feel" is that the probability of 2) is pretty small, but I
don't have anything to point to as a proof of that...
--
output = reverse("gro.mca" "@" "enworbbc")
http://www3.sympatico.ca/cbbrowne/
"One of the main causes of the fall of the Roman Empire was that,
lacking zero, they had no way to indicate successful termination of
their C programs." -- Robert Firth