|
Home > Archive > PostgreSQL Bugs > November 2005 > Autovacuum deadlock - bug or not?
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
Autovacuum deadlock - bug or not?
|
|
| Mikael Carneholm 2005-11-17, 9:24 am |
| Don't know if this is a bug or just undocumented, but it seems as you should turn off autovacuum before you run CLUSTER, otherwise you might run into a deadlock:
NOTICE: Clustering idx_vehicle_unit_dat
a_200407_person_info
rmation__id on vehicle_unit_data_20
0407
ERROR: deadlock detected
DETAIL: Process 29022 waits for AccessExclusiveLock on relation 68950 of database 16390; blocked by process 15865.
Process 15865 waits for AccessShareLock on relation 68122 of database 16390; blocked by process 29022.
/Mikael
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql
.org so that your
message can get through to the mailing list cleanly
| |
| Tom Lane 2005-11-17, 9:24 am |
| "Mikael Carneholm" <Mikael. Carneholm@WirelessCa
r.com> writes:
> Don't know if this is a bug or just undocumented, but it seems as you should turn off autovacuum before you run CLUSTER, otherwise you might run into a deadlock:
Which variant of CLUSTER were you using? ISTR that some of them lock
the index before the table, which is prone to deadlock against nearly
all other operations on the table (not just vacuum).
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
| |
| Mikael Carneholm 2005-11-17, 9:24 am |
| variant: CLUSTER indexname ON tablename
Maybe there should be something about this in the docs, so that users don't get surprised when this happens and start sending stupid emails to the pgsql-bugs list :)
/Mikael
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: den 17 november 2005 15:51
To: Mikael Carneholm
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Autovacuum deadlock - bug or not?
"Mikael Carneholm" <Mikael. Carneholm@WirelessCa
r.com> writes:
> Don't know if this is a bug or just undocumented, but it seems as you should turn off autovacuum before you run CLUSTER, otherwise you might run into a deadlock:
Which variant of CLUSTER were you using? ISTR that some of them lock
the index before the table, which is prone to deadlock against nearly
all other operations on the table (not just vacuum).
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
| |
| Tom Lane 2005-11-17, 9:24 am |
| "Mikael Carneholm" <Mikael. Carneholm@WirelessCa
r.com> writes:
> variant: CLUSTER indexname ON tablename
Hmph. Looking at the code, that should always lock the table first,
so I don't see where the problem is. Would you look up the numbers
for us --- exactly which relations were involved in the deadlock,
and (if you can tell) which process was which?
Also, what PG version is this exactly?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql
.org so that your
message can get through to the mailing list cleanly
| |
| Mikael Carneholm 2005-11-17, 9:24 am |
| Unfortunately, relfilenodes 68950 and 68122 don't exist anymore, could be that they were temporary copies of tables or indexes. I know that process 15865 was the autovacuum pid, I looked that up when it happened (pg was restarted with autovacuum=off after
wards, so that process is also gone)
Could it be that the deadlock was caused by autovacuum trying to vacuum one of the temp copies?
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: den 17 november 2005 16:04
To: Mikael Carneholm
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Autovacuum deadlock - bug or not?
"Mikael Carneholm" <Mikael. Carneholm@WirelessCa
r.com> writes:
> variant: CLUSTER indexname ON tablename
Hmph. Looking at the code, that should always lock the table first,
so I don't see where the problem is. Would you look up the numbers
for us --- exactly which relations were involved in the deadlock,
and (if you can tell) which process was which?
Also, what PG version is this exactly?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
| |
| Mikael Carneholm 2005-11-17, 9:24 am |
| Forgot to mention:
dfol=> select version();
version
-------------------------------------------------------------
PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: den 17 november 2005 16:04
To: Mikael Carneholm
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Autovacuum deadlock - bug or not?
"Mikael Carneholm" <Mikael. Carneholm@WirelessCa
r.com> writes:
> variant: CLUSTER indexname ON tablename
Hmph. Looking at the code, that should always lock the table first,
so I don't see where the problem is. Would you look up the numbers
for us --- exactly which relations were involved in the deadlock,
and (if you can tell) which process was which?
Also, what PG version is this exactly?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
| |
| Tom Lane 2005-11-17, 11:24 am |
| "Mikael Carneholm" <Mikael. Carneholm@WirelessCa
r.com> writes:
> Unfortunately, relfilenodes 68950 and 68122 don't exist anymore,
You should be looking at pg_class.oid, not relfilenode.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
| |
| Mikael Carneholm 2005-11-17, 11:24 am |
| oops, you're right.
dfol=> select pgc.oid, pgc.relname from pg_class pgc where pgc.oid in (68950, 68122);
oid | relname
-------+--------------------------
68950 | vehicle_unit_data_20
0407
68122 | vehicle_unit_data_20
0301
NOTICE: Clustering idx_vehicle_unit_dat
a_200407_person_info
rmation__id on vehicle_unit_data_20
0407
ERROR: deadlock detected
DETAIL: Process 29022 waits for AccessExclusiveLock on relation 68950 of database 16390; blocked by process 15865.
Process 15865 waits for AccessShareLock on relation 68122 of database 16390; blocked by process 29022.
So it seems that it was the clustering of idx_vehicle_unit_dat
a_200407_person_info
rmation__id on vehicle_unit_data_20
0407 that caused the deadlock.
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: den 17 november 2005 16:53
To: Mikael Carneholm
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Autovacuum deadlock - bug or not?
"Mikael Carneholm" <Mikael. Carneholm@WirelessCa
r.com> writes:
> Unfortunately, relfilenodes 68950 and 68122 don't exist anymore,
You should be looking at pg_class.oid, not relfilenode.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
| |
| Tom Lane 2005-11-17, 11:24 am |
| "Mikael Carneholm" <Mikael. Carneholm@WirelessCa
r.com> writes:
> dfol=> select pgc.oid, pgc.relname from pg_class pgc where pgc.oid in (68950, 68122);
> oid | relname
> -------+--------------------------
> 68950 | vehicle_unit_data_20
0407
> 68122 | vehicle_unit_data_20
0301
> NOTICE: Clustering idx_vehicle_unit_dat
a_200407_person_info
rmation__id on vehicle_unit_data_20
0407
> ERROR: deadlock detected
> DETAIL: Process 29022 waits for AccessExclusiveLock on relation 68950 of database 16390; blocked by process 15865.
> Process 15865 waits for AccessShareLock on relation 68122 of database 16390; blocked by process 29022.
> So it seems that it was the clustering of idx_vehicle_unit_dat
a_200407_person_info
rmation__id on vehicle_unit_data_20
0407 that caused the deadlock.
Hmm, the CLUSTER on vehicle_unit_data_20
0407 wouldn't have taken any
lock on vehicle_unit_data_20
0301. Were you perhaps issuing a series
of CLUSTERs inside a transaction block? That would pile up exclusive
locks on all the tables involved, which is certainly deadlock-prone.
I'm also wondering where that NOTICE "Clustering ..." came from, because
there is no such message anywhere in the 8.1 PG sources. You *sure*
this is 8.1?
There's something funny about 15865 too; you said that was an autovacuum
process but I don't think so. VACUUM doesn't take AccessShareLock;
there's a different lock type that that tries to acquire. And it
doesn't take any locks at all on more than one user table at a time.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
| |
| Mikael Carneholm 2005-11-17, 11:24 am |
| >Hmm, the CLUSTER on vehicle_unit_data_20
0407 wouldn't have taken any
>lock on vehicle_unit_data_20
0301. Were you perhaps issuing a series
>of CLUSTERs inside a transaction block? That would pile up exclusive
>locks on all the tables involved, which is certainly deadlock-prone.
Ah, that explains it...the clustering is done from a function that's clustering all (child) tables of a parent, and as I've come to understand, all functions have a built-in transaction (the "Clustering idx_foo on bar" notice message also originates from
that function)
I'm pretty sure though that 15865 was the autovacuum process, but I'll have to run the function with autovacuum turned on before I can verify that this is (was) the case.
/Mikael
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: den 17 november 2005 17:42
To: Mikael Carneholm
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Autovacuum deadlock - bug or not?
"Mikael Carneholm" <Mikael. Carneholm@WirelessCa
r.com> writes:
> dfol=> select pgc.oid, pgc.relname from pg_class pgc where pgc.oid in (68950, 68122);
> oid | relname
> -------+--------------------------
> 68950 | vehicle_unit_data_20
0407
> 68122 | vehicle_unit_data_20
0301
> NOTICE: Clustering idx_vehicle_unit_dat
a_200407_person_info
rmation__id on vehicle_unit_data_20
0407
> ERROR: deadlock detected
> DETAIL: Process 29022 waits for AccessExclusiveLock on relation 68950 of database 16390; blocked by process 15865.
> Process 15865 waits for AccessShareLock on relation 68122 of database 16390; blocked by process 29022.
> So it seems that it was the clustering of idx_vehicle_unit_dat
a_200407_person_info
rmation__id on vehicle_unit_data_20
0407 that caused the deadlock.
Hmm, the CLUSTER on vehicle_unit_data_20
0407 wouldn't have taken any
lock on vehicle_unit_data_20
0301. Were you perhaps issuing a series
of CLUSTERs inside a transaction block? That would pile up exclusive
locks on all the tables involved, which is certainly deadlock-prone.
I'm also wondering where that NOTICE "Clustering ..." came from, because
there is no such message anywhere in the 8.1 PG sources. You *sure*
this is 8.1?
There's something funny about 15865 too; you said that was an autovacuum
process but I don't think so. VACUUM doesn't take AccessShareLock;
there's a different lock type that that tries to acquire. And it
doesn't take any locks at all on more than one user table at a time.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
|
|
|
|
|