|
Home > Archive > PostgreSQL Hacks > January 2006 > stats for failed transactions (was Re: [GENERAL] VACUUM Question)
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
stats for failed transactions (was Re: [GENERAL] VACUUM Question)
|
|
| Tom Lane 2006-01-27, 9:24 am |
| "Matthew T. O'Connor" <matthew@zeut.net> writes:
[color=darkred]
> hmm... That's true. I don't think autovacuum doesn't anything to account
> for the concept of rolledback inserts.
I think this is the fault of the stats system design. AFAICT from a
quick look at the code, inserted/updated/deleted tuples are reported
to the collector in the same way regardless of whether the sending
transaction committed or rolled back. I think this is unquestionably
a bug, at least for autovacuum's purposes --- though it might be OK
for the original intent of the stats system, which was simply to track
activity levels.
Any thoughts about how it ought to work?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql
.org so that your
message can get through to the mailing list cleanly
| |
| Alvaro Herrera 2006-01-27, 11:24 am |
| Tom Lane wrote:
> I think this is the fault of the stats system design. AFAICT from a
> quick look at the code, inserted/updated/deleted tuples are reported
> to the collector in the same way regardless of whether the sending
> transaction committed or rolled back. I think this is unquestionably
> a bug, at least for autovacuum's purposes --- though it might be OK
> for the original intent of the stats system, which was simply to track
> activity levels.
>
> Any thoughts about how it ought to work?
I don't remember exactly how it works -- I think the activity (insert,
update, delete) counters are kept separately from commit/rollback
status, right? Maybe we should keep three separate counters: "current
transaction counters" and "counters for transactions that were
aborted/committed". We only send the latter counts, and the former are
added to them when the transaction ends.
--
Alvaro Herrera http://www.advogato.org/person/alvherre
"Aprende a avergonzarte más ante ti que ante los demás" (Demócrito)
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
| |
| Tom Lane 2006-01-27, 11:24 am |
| Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
[color=darkred]
> I don't remember exactly how it works -- I think the activity (insert,
> update, delete) counters are kept separately from commit/rollback
> status, right? Maybe we should keep three separate counters: "current
> transaction counters" and "counters for transactions that were
> aborted/committed". We only send the latter counts, and the former are
> added to them when the transaction ends.
My question was at a higher level, actually: *what* should we be
counting?
I think doubling the number of counters in the stats system, which is
what you seem to be proposing, is probably not acceptable --- we've
already got a problem with the stats file becoming unreasonably bulky.
We need to figure out exactly which counts there is adequate reason
to be tracking.
I don't, for instance, see any percentage in tracking block-level I/O
operations separately for committed and rolled-back transactions.
Those numbers are certainly things you watch only for total activity,
and a failed xact is just as much system load as a committed one.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
| |
| Alvaro Herrera 2006-01-27, 11:24 am |
| Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>
>
> My question was at a higher level, actually: *what* should we be
> counting?
Oh, I see. Do you think small incremental improvements to the stat
system will buy us much? I think we should be thinking big here, i.e.
rewrite most stuff instead. In the meantime, we should fix the minor
issues but not spend too much time on it; IMHO anyway.
I can devote some time to it starting from, say, mid february, which is
when I think I'm going to have more time to spend on community stuff.
(I've been spending the last couple of months on PL/php and internal
Command Prompt stuff.)
--
Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"XML!" Exclaimed C++. "What are you doing here? You're not a programming
language."
"Tell that to the people who use me," said XML.
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
| |
| Tom Lane 2006-01-27, 11:24 am |
| Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
[color=darkred]
> Oh, I see. Do you think small incremental improvements to the stat
> system will buy us much? I think we should be thinking big here, i.e.
> rewrite most stuff instead.
Uh, I wasn't aware of any proposals for a major rewrite. What did you
have in mind?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
| |
| Alvaro Herrera 2006-01-27, 1:24 pm |
| Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>
>
> Uh, I wasn't aware of any proposals for a major rewrite. What did you
> have in mind?
Nothing yet really. But we mentioned it in the recent past. (About not
using UDP, and also about the property of not having each backend load
the whole stat file every time.)
--
Alvaro Herrera http://www.PlanetPostgreSQL.org
"This is a foot just waiting to be shot" (Andrew Dunstan)
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
| |
| Tom Lane 2006-01-28, 9:23 am |
| "Matthew T. O'Connor" <matthew@zeut.net> writes:
> None of this directly addresses the question of what the stats system
> *should* track, but perhaps it is wrongheaded to totally redesign the
> stats system for the purposes of autovacuum.
I'd argue it's fine: there are tons of people using row-level stats
via autovacuum, and (AFAICT) just about nobody using 'em for any other
purpose. Certainly you never see anyone suggesting them as a tool for
investigating problems on pgsql-performance. Sure, it's a repurposing
of the stats subsystem, but we should be willing to do that when field
experience teaches us what's really needed.
> As a quick semi-fix, perhaps autovacuum should look at the number of
> rollbacks vs. commits in an attempt to determine the accuracy of the
> stats. For example if 50% of the transactions are getting rolled
> back, then autovacuum might include 50% of the inserts in the count
> towards the vacuum threshold. Obviously this isn't perfect, but it
> probably gets us closer to reality with the information already
> available.
But all that we have is *global* counts. Inferring ratios applicable to
particular tables seems a big stretch to me. Any given application is
likely to have some types of transactions that roll back much more often
than others.
One thing we could do is tie the stats message sending more tightly to
top-level transaction commit/abort. (It's already effectively true that
we send stats only after commit/abort, but we send 'em from the wrong
place, ie PostgresMain.) Then the semantics of the message could be
read as "here's what I did before committing" or "here's what I did
before aborting" and the collector could interpret the counts
accordingly. However, this still fails in the case where a committed
top-level transaction includes some failed subtransactions. I think
the only full solution will involve backends doing some extra work at
subtransaction commit/abort so that they can report properly classified
update counts.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
| |
| Tom Lane 2006-01-28, 11:23 am |
| "Matthew T. O'Connor" <matthew@zeut.net> writes:
> Tom Lane wrote:
[color=darkred]
> Any guess as to the performance implications?
Pushing some counts from one place to another doesn't seem that
expensive, but it'd be nice to avoid scanning a lot of unrelated
table-stats entries to find the ones that have to be adjusted.
Not sure what it'll take exactly.
Or we could blow it off for the time being. Certainly, getting
things right at the top-transaction level would already be a big
leg up in accuracy from where we are, and I don't think that would
be hard at all.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
|
|
|
|
|