Home > Archive > PostgreSQL Bugs > May 2005 > Missing tables in postgresql 7.2.4









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Missing tables in postgresql 7.2.4
Michael Beckstette

2005-05-11, 11:24 am

Hi,

we recently discovered on our production database an a little bit bizarre
problem (after two years stable operations). Some tables are simply missing, or
sometimes the affected table(s) is/are there but not listed in pg_tables.

An example (I am looking for the table kog_blasthit_tables)
:

dev_db=# select version();
version
-------------------------------------------------------------------------
PostgreSQL 7.2.4 on sparc-sun-solaris2.8, compiled by GCC gcc (GCC) 3.1
(1 row)

dev_db=# \d kog_ <TAB EXPANSION>
kog_blasthits_templa
te kog_hits_obj_2374_q4
11_db7
kog_hits_obj_1016_q3
44_db7 kog_hits_obj_2396_q8
33_db7
kog_hits_obj_1341_q4
75_db7 kog_hits_obj_2491_q5
05_db7
kog_hits_obj_1362_q4
12_db7 kog_hits_obj_2516_q9
00_db7
kog_hits_obj_1364_q4
13_db7 kog_hits_obj_2559_q7
8_db7
kog_hits_obj_1584_q5
70_db7 kog_hits_obj_822_q36
9_db7
kog_hits_obj_1604_q5
74_db7 kog_hits_obj_834_q37
1_db7
kog_hits_obj_1660_q6
08_db7 kog_hits_obj_846_q37
5_db7
kog_hits_obj_1725_q6
50_db7 kog_hits_obj_880_q33
9_db7
kog_hits_obj_1737_q6
05_db7 kog_info
kog_hits_obj_2186_q6
86_db7 kog_org

// No kog_blasthit_tables !!!

BUT it seems to be there:

dev_db=# select * FROM kog_blasthit_tables LIMIT 1;
id | object_id | query_set_id | db_set_id | num_of_query_sequenc
es |
table_name | table_desc | number_of_entries |
blast_parameters | blast_prg | evalue_cutoff | owner
| project | created | status |
number_of_unique_ent
ries | reciprocal_first_id | reciprocal_second_id

-----+-----------+--------------+-----------+------------------------+----------------------------+------------+-------------------+-------------------------------------------+-----------+---------------+----------+---------+------------------------------
-+----------+--------------------------+---------------------+----------------------
398 | 1341 | 475 | 7 | 26187 |
kog_hits_obj_1341_q4
75_db7 | na | 0 | -a 1 -m 7 -T F -F
T -p blastx -e 0.100000 | kogblast | 0.1 | mbeckste | 23 |
2004-02-26 18:36:48.048881+01 | finished | 27426 |
-1 | -1
(1 row)

dev_db=#

// The table seems to be there but not in pg_tables

dev_db=# select * FROM pg_tables WHERE tablename='kog_blast
hit_tables';
tablename | tableowner | hasindexes | hasrules | hastriggers
-----------+------------+------------+----------+-------------
(0 rows)


dev_db=# \d pg_ta <TAB EXPANSION>
// nothing but

dev_db=# select count(*) FROM pg_tables;
count
-------
875
(1 row)


The case which is even worth, is that the table is missing completely. We have
a catalog table that keeps track of generated tables, so we know that it has to
be there. The postmaster logs (with high debug level) show no DROP TABLE
between the time of the CREATE TABLE statement of the affected table and the
time where we noticed that it is missing.

I know that this is probably not enough information for a precise analysis. May
be someone can give me a hint how to further investigate this issue.


regards
Michael

--

------------------------------------------------------------------------------
Dipl.-Inform. Michael Beckstette Office: M3-129
AG-PI / Technische Fakultaet EMail:mbeckste@techf
ak.uni-bielefeld.de
Universitaet Bielefeld Fon: +49-521-106-2914
Postfach 100131 Fax: +49-521-106-6411
D-33501 BIELEFELD
Germany

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Tom Lane

2005-05-11, 11:24 am

"Michael Beckstette" <mbeckste@TechFak.Uni-Bielefeld.DE> writes:
> we recently discovered on our production database an a little bit bizarre
> problem (after two years stable operations). Some tables are simply missing, or
> sometimes the affected table(s) is/are there but not listed in pg_tables.


This sounds a bit like a transaction ID wraparound problem. Have you
been vacuuming your whole database on a reasonable schedule? The
missing tables might conceivably be old enough that their pg_class rows
have wrapped around "into the future". It'd be useful to look at
SELECT datname, age(datfrozenxid) FROM pg_database;

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Michael Beckstette

2005-05-11, 11:24 am

Hi Tom,

this is the output from 'SELECT datname, age(datfrozenxid) FROM pg_database;'

datname | age
-------------+-------------
xgc | -1950241750
dev_db | -1886587214
template1 | -1884294460
template0 | -1884294460
promo_db | -1884294460
snap_db_new | -1884294460
gendev_db | 1887538137
(7 rows)

dev_db=#

The affected DB is 'dev_db', although it looks like the others except
'gendev_db' have a wraparound problem too (?). To answer your question about
the VACUUM: We VACUUM FULL a few tables with a high amount of INSERT/DEL
operations once per hour, but I have to admit that we perform a VACUUM of the
whole DB not on a regulary basis. I think the last one was several monthes ago.
Further on we use transactions at several places and we have at least 20
transactions per minute.

Does now a normal VACUUM FULL of the whole DB(s) fix our problem?

Michael


On May 11, 11:51am, Tom Lane wrote:
> Subject: Re: [BUGS] Missing tables in postgresql 7.2.4
> "Michael Beckstette" <mbeckste@TechFak.Uni-Bielefeld.DE> writes:
missing, or[color=darkred]
>
> This sounds a bit like a transaction ID wraparound problem. Have you
> been vacuuming your whole database on a reasonable schedule? The
> missing tables might conceivably be old enough that their pg_class rows
> have wrapped around "into the future". It'd be useful to look at
> SELECT datname, age(datfrozenxid) FROM pg_database;
>
> regards, tom lane




--

------------------------------------------------------------------------------
Dipl.-Inform. Michael Beckstette Office: M3-129
AG-PI / Technische Fakultaet EMail:mbeckste@techf
ak.uni-bielefeld.de
Universitaet Bielefeld Fon: +49-521-106-2914
Postfach 100131 Fax: +49-521-106-6411
D-33501 BIELEFELD
Germany

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Michael Beckstette

2005-05-11, 1:24 pm

Hi,

after reading the docs (I know it was a little bit late), I am now relatively
sure that I trapped into a transaction ID wraparound problem. For me its now a
little bit unclear, how to proceed in order to minimize the caused damage.

I checked all tables in the affected DB. Till now, the results are as follows:

6 user tables are completely lost.
8 user tables are not listed in pg_tables but still accessible by a
SELECT.

The 6 completely lost tables are not so dramatical, because they contain only
static data, that I can restore from the development system. But what happens
with the 8 tables that are still accessable, but not listed in pg_tables, after
a VACUUM? Will they be removed completely or 'reinserted' into pg_tables?

Does anyone has an advise how to proceed in this situation?

Regards
Michael



On May 11, 6:11pm, Michael Beckstette wrote:
> Subject: Re: [BUGS] Missing tables in postgresql 7.2.4
> Hi Tom,
>
> this is the output from 'SELECT datname, age(datfrozenxid) FROM pg_database;'
>
> datname | age
> -------------+-------------
> xgc | -1950241750
> dev_db | -1886587214
> template1 | -1884294460
> template0 | -1884294460
> promo_db | -1884294460
> snap_db_new | -1884294460
> gendev_db | 1887538137
> (7 rows)
>
> dev_db=#
>
> The affected DB is 'dev_db', although it looks like the others except
> 'gendev_db' have a wraparound problem too (?). To answer your question about
> the VACUUM: We VACUUM FULL a few tables with a high amount of INSERT/DEL
> operations once per hour, but I have to admit that we perform a VACUUM of the
> whole DB not on a regulary basis. I think the last one was several monthes

ago.
> Further on we use transactions at several places and we have at least 20
> transactions per minute.
>
> Does now a normal VACUUM FULL of the whole DB(s) fix our problem?
>
> Michael
>
>
> On May 11, 11:51am, Tom Lane wrote:
> missing, or
>
>
>
> --
>
> ------------------------------------------------------------------------------
> Dipl.-Inform. Michael Beckstette Office: M3-129
> AG-PI / Technische Fakultaet

EMail:mbeckste@techf
ak.uni-bielefeld.de
> Universitaet Bielefeld Fon: +49-521-106-2914
> Postfach 100131 Fax: +49-521-106-6411
> D-33501 BIELEFELD
> Germany
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>-- End of excerpt from Michael Beckstette




--

------------------------------------------------------------------------------
Dipl.-Inform. Michael Beckstette Office: M3-129
AG-PI / Technische Fakultaet EMail:mbeckste@techf
ak.uni-bielefeld.de
Universitaet Bielefeld Fon: +49-521-106-2914
Postfach 100131 Fax: +49-521-106-6411
D-33501 BIELEFELD
Germany

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Tom Lane

2005-05-11, 1:24 pm

"Michael Beckstette" <mbeckste@TechFak.Uni-Bielefeld.DE> writes:
> The 6 completely lost tables are not so dramatical, because they contain only
> static data, that I can restore from the development system. But what happens
> with the 8 tables that are still accessable, but not listed in pg_tables, after
> a VACUUM? Will they be removed completely or 'reinserted' into pg_tables?


> Does anyone has an advise how to proceed in this situation?


What I would recommend as a first step is to stop the postmaster and
then take a tarball backup of the entire $PGDATA tree. This will at
least provide a chance to go back if subsequent tries mess things up
completely.

After that, I'd try a plain VACUUM (not FULL, not FREEZE) of pg_class
and see if that restores the missing tables to view in pg_tables.
If it does, go ahead and do a database-wide plain VACUUM, and you
should be OK. If it doesn't, we'll need to think of another plan.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Michael Beckstette

2005-05-11, 8:24 pm

Hi,


On May 11, 2:39pm, Tom Lane wrote:

> What I would recommend as a first step is to stop the postmaster and
> then take a tarball backup of the entire $PGDATA tree. This will at
> least provide a chance to go back if subsequent tries mess things up
> completely.


Done. This was probably the biggest tar ball I have ever build (~450GB) ;)


>
> After that, I'd try a plain VACUUM (not FULL, not FREEZE) of pg_class
> and see if that restores the missing tables to view in pg_tables.


Done. The tables that were missing in pg_tables (but accessible with SELECT)
are now listed in pg_tables.


> If it does, go ahead and do a database-wide plain VACUUM, and you
> should be OK.


Done. As far as I can tell, everything is OK again.


Thanks a lot Tom!


P.S.:A TODO for me: CRON Script for weekly VACUUM ;)


Michael

--

------------------------------------------------------------------------------
Dipl.-Inform. Michael Beckstette Office: M3-129
AG-PI / Technische Fakultaet EMail:mbeckste@techf
ak.uni-bielefeld.de
Universitaet Bielefeld Fon: +49-521-106-2914
Postfach 100131 Fax: +49-521-106-6411
D-33501 BIELEFELD
Germany

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

John R Pierce

2005-05-12, 3:24 am

> P.S.:A TODO for me: CRON Script for weekly VACUUM ;)


on heavy use databases, mine generally does a light vacuum every 4 hours, and a
once a day full on everything. also, a weekly full reindex on a really really
heavy use systems like this one message board server I ad-mangle

something like...

7 */4 * * * vacuumdb busy_database >/dev/null 2>/dev/null
37 0 * * * vacuumdb -a -z >/dev/null 2>/dev/null
37 2 * * Sun reindexdb -a > /dev/null 2>/dev/null



many might say this is total overkill, I dunno. keeps this server happy. btw,
reindexdb is a script from contrib.

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Tom Lane

2005-05-12, 3:24 am

"Michael Beckstette" <mbeckste@TechFak.Uni-Bielefeld.DE> writes:
> On May 11, 2:39pm, Tom Lane wrote:
[color=darkred]
> Done. As far as I can tell, everything is OK again.


Sweet ;-) In the words of my former business partner, a private pilot
with more hours aloft than many airline captains: "Walked away from
another one ..."

For the benefit of onlookers, the gambit being played here went like
this: the missing pg_class rows must have fairly recently wrapped around
the 2G transaction mark with respect to the current XID counter. That
made them "in the future" not "in the past" as far as normal queries
go. However, a VACUUM will freeze-as-good any tuples that are "in the
past" with respect to the vacuum freeze time, which for a plain VACUUM
is 1G transactions ago. So as long as Michael notices he has a problem
within 1 billion transactions of having a problem, he can get out of it.

I cannot claim that this behavior was operating-as-designed, because
I'm pretty sure we hadn't thought it through when planning the
wraparound XID behavior. But we walked away from another one.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql
.org

Michael Beckstette

2005-05-12, 9:25 am

Hello,

two last questions about the transaction ID wraparound problem of my DBs I had
yesterday (see former postings). After applying the 'recovery procedure' Tom
suggested now my XIDs are looking almost fine again, except for the template0
DB.

dev_db=# SELECT datname, age(datfrozenxid) FROM pg_database;

datname | age
-------------+-------------
xgc | 1075434814
dev_db | 1074296718
template1 | 1075486644
template0 | -1882457315
promo_db | 1075528357
snap_db_new | 1075528467
gendev_db | 1075512627

Is this negligible or can it cause any harm in the future? Is there a way to
VACUUM template0 as well?

My second point is more a suggestion. After Toms strategy worked quite well (at
least for me) it is maybe worthwhile to put it somewhere in the docs. Maybe
together with other things in a section called 'Disaster recovery strategies'?


regards
Michael

--

------------------------------------------------------------------------------
Dipl.-Inform. Michael Beckstette Office: M3-129
AG-PI / Technische Fakultaet EMail:mbeckste@techf
ak.uni-bielefeld.de
Universitaet Bielefeld Fon: +49-521-106-2914
Postfach 100131 Fax: +49-521-106-6411
D-33501 BIELEFELD
Germany

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Tom Lane

2005-05-12, 9:25 am

"Michael Beckstette" <mbeckste@TechFak.Uni-Bielefeld.DE> writes:
> two last questions about the transaction ID wraparound problem of my DBs I had
> yesterday (see former postings). After applying the 'recovery procedure' Tom
> suggested now my XIDs are looking almost fine again, except for the template0
> DB.


You don't have to worry about template0 -- it was "frozen" during initdb
and does not need vacuuming.

> My second point is more a suggestion. After Toms strategy worked quite
> well (at least for me) it is maybe worthwhile to put it somewhere in
> the docs. Maybe together with other things in a section called
> 'Disaster recovery strategies'?


I'm not sure it can be claimed to be tested well enough to publish as a
recovery strategy. I'm glad it worked for you, but ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com