Home > Archive > PostgreSQL Performance > September 2005 > Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)
Gurpreet Aulakh

2005-09-21, 8:24 pm

I currently have a Postgres 7.3 database running under WIN2K using cygwin
and want to move to Postgres 8.0.3 (native windows version).
I am finding most simple queries are significantly faster on the native
windows version compared to 7.3 (under cygwin).
However, for a complex query, that involve multiple JOINs, the 7.3 version
is actually faster (about 2X faster).

The query that I am running was optimized to run under 7.3. It was
specifically modified to control the planner with explicit JOINs.
When I run the same query on the 8.0.3 version with the join_collapse_limit
set to 1 the query is slower.

Can someone tell me why setting the join_collapse_limit to 1 in the 8.0
version does not produce similar results to the 7.3 version?
Does anyone have any suggestions on what I can do? Do I have to rewrite the
query?


Here are the results of an explain analyze on the query.

Explain analyze Postgres 7.3 running on WIN2K using cygwin.

Hash Join (cost=21808.27..1946264.80 rows=2982 width=1598) (actual
time=2186.00..2320.00 rows=50 loops=1)
Hash Cond: ("outer". doc_internalparentom
xref = "inner".doc_documentid)
-> Hash Join (cost=20948.78..1945323.29 rows=2982 width=1534) (actual
time=2110.00..2227.00 rows=50 loops=1)
Hash Cond: ("outer". doc_internalrootxref
= "inner".doc_documentid)
-> Hash Join (cost=20089.29..1944381.79 rows=2982 width=1484)
(actual time=2067.00..2179.00 rows=50 loops=1)
Hash Cond: ("outer".doc_documentid = "inner".doc_documentid)
Join Filter: ("inner".dc_doccontacttype = 'FROM'::character
varying)
-> Hash Join (cost=7455.14..1928613.59 rows=2982
width=1138) (actual time=1216.00..1539.00 rows=50 loops=1)
Hash Cond: ("outer".doc_documentid =
"inner".doc_documentid)
Join Filter: ("inner".dc_doccontacttype =
'TO'::character varying)
-> Hash Join (cost=183.49..1918519.06 rows=2860
width=792) (actual time=64.00..301.00 rows=50 loops=1)
Hash Cond: ("outer".doc_documentid =
"inner".doc_documentid)
-> Seq Scan on document finaldoc
(cost=0.00..1918256.94 rows=2860 width=717) (actual time=13.00..254.00
rows=50 loops=1)
Filter: (subplan)
SubPlan
-> Materialize (cost=335.27..335.27
rows=50 width=160) (actual time=0.00..0.01 rows=50 loops=5719)
-> Limit (cost=0.00..335.27
rows=50 width=160) (actual time=3.00..8.00 rows=50 loops=1)
-> Nested Loop
(cost=0.00..38347.95 rows=5719 width=160) (actual time=3.00..8.00 rows=51
loops=1)
-> Merge Join
(cost=0.00..3910.14 rows=5719 width=120) (actual time=3.00..3.00 rows=51
loops=1)
Merge Cond:
("outer".doc_documentid = "inner".doc_documentid)
-> Index Scan
using pk_document on document doc (cost=0.00..3256.48 rows=5719 width=80)
(actual time=1.00..1.00 rows=51 loops=1)
-> Index Scan
using pk_folder_document on folder_document (cost=0.00..553.91 rows=5719
width=40) (actual time=2.00..2.00 rows=51 loops=1)
-> Index Scan using
pk_document on document root (cost=0.00..6.01 rows=1 width=40) (actual
time=0.10..0.10 rows=1 loops=51)
Index Cond:
("outer". doc_internalrootxref
= root.doc_documentid)
-> Hash (cost=169.19..169.19 rows=5719
width=75) (actual time=31.00..31.00 rows=0 loops=1)
-> Seq Scan on folder_document
(cost=0.00..169.19 rows=5719 width=75) (actual time=0.00..11.00 rows=5719
loops=1)
-> Hash (cost=1328.80..1328.80 rows=34280 width=346)
(actual time=846.00..846.00 rows=0 loops=1)
-> Seq Scan on doccontact dcto
(cost=0.00..1328.80 rows=34280 width=346) (actual time=0.00..175.00
rows=34280 loops=1)
-> Hash (cost=1328.80..1328.80 rows=34280 width=346)
(actual time=445.00..445.00 rows=0 loops=1)
-> Seq Scan on doccontact dcfrom (cost=0.00..1328.80
rows=34280 width=346) (actual time=0.00..223.00 rows=34280 loops=1)
-> Hash (cost=845.19..845.19 rows=5719 width=50) (actual
time=42.00..42.00 rows=0 loops=1)
-> Seq Scan on document root (cost=0.00..845.19 rows=5719
width=50) (actual time=0.00..2.00 rows=5719 loops=1)
-> Hash (cost=845.19..845.19 rows=5719 width=64) (actual
time=73.00..73.00 rows=0 loops=1)
-> Seq Scan on document parentom (cost=0.00..845.19 rows=5719
width=64) (actual time=0.00..30.00 rows=5719 loops=1)
SubPlan
-> Limit (cost=0.00..5.56 rows=1 width=40) (actual time=0.06..0.06
rows=0 loops=50)
-> Result (cost=0.00..7.20 rows=1 width=40) (actual
time=0.06..0.06 rows=0 loops=50)
One-Time Filter: ($0 = true)
-> Index Scan using documentevent_index on documentevent
de (cost=0.00..7.20 rows=1 width=40) (actual time=0.07..0.07 rows=0
loops=44)
Index Cond: (($1 = doc_documentid) AND
(de_processedflag = false) AND (de_documenteventsta
tus = 'ERROR'::character
varying))
-> Limit (cost=0.00..3.86 rows=1 width=40) (actual time=0.10..0.10
rows=0 loops=50)

Explain analyze Postgres 8.0.3 running natively under WIN2K.

Hash IN Join (cost=5293.09..7121.89 rows=50 width=1369) (actual
time=1062.000..5558.000 rows=50 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t =
("inner". doc_documentid)::tex
t)
-> Hash Left Join (cost=4798.24..6199.29 rows=5741 width=1369) (actual
time=751.000..4236.000 rows=5719 loops=1)
Hash Cond: (("outer". doc_internalparentom
xref)::text =
("inner". doc_documentid)::tex
t)
-> Hash Left Join (cost=3956.48..5271.41 rows=5741 width=1345)
(actual time=541.000..3105.000 rows=5719 loops=1)
Hash Cond: (("outer". doc_internalrootxref
)::text =
("inner". doc_documentid)::tex
t)
-> Hash Left Join (cost=3114.72..4343.53 rows=5741
width=1335) (actual time=501.000..2313.000 rows=5719 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t =
("inner". doc_documentid)::tex
t)
-> Hash Left Join (cost=1649.92..2721.09 rows=5741
width=1039) (actual time=180.000..1342.000 rows=5719 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t =
("inner". doc_documentid)::tex
t)
-> Hash Left Join (cost=185.13..1098.65
rows=5741 width=743) (actual time=40.000..592.000 rows=5719 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t
= ("inner". doc_documentid)::tex
t)
-> Seq Scan on document finaldoc
(cost=0.00..827.41 rows=5741 width=708) (actual time=0.000..41.000 rows=5719
loops=1)
-> Hash (cost=170.70..170.70 rows=5770
width=75) (actual time=40.000..40.000 rows=0 loops=1)
-> Seq Scan on folder_document
(cost=0.00..170.70 rows=5770 width=75) (actual time=0.000..10.000 rows=5719
loops=1)
-> Hash (cost=1450.50..1450.50 rows=5718
width=336) (actual time=140.000..140.000 rows=0 loops=1)
-> Seq Scan on doccontact dcto
(cost=0.00..1450.50 rows=5718 width=336) (actual time=0.000..130.000
rows=5718 loops=1)
Filter: ((dc_doccontacttype)
::text =
'TO'::text)
-> Hash (cost=1450.50..1450.50 rows=5718 width=336)
(actual time=321.000..321.000 rows=0 loops=1)
-> Seq Scan on doccontact dcfrom
(cost=0.00..1450.50 rows=5718 width=336) (actual time=10.000..291.000
rows=5718 loops=1)
Filter: ((dc_doccontacttype)
::text =
'FROM'::text)
-> Hash (cost=827.41..827.41 rows=5741 width=50) (actual
time=40.000..40.000 rows=0 loops=1)
-> Seq Scan on document root (cost=0.00..827.41
rows=5741 width=50) (actual time=0.000..30.000 rows=5719 loops=1)
-> Hash (cost=827.41..827.41 rows=5741 width=64) (actual
time=210.000..210.000 rows=0 loops=1)
-> Seq Scan on document parentom (cost=0.00..827.41
rows=5741 width=64) (actual time=0.000..160.000 rows=5719 loops=1)
-> Hash (cost=494.73..494.73 rows=50 width=42) (actual
time=261.000..261.000 rows=0 loops=1)
-> Subquery Scan "IN_subquery" (cost=185.13..494.73 rows=50
width=42) (actual time=101.000..261.000 rows=50 loops=1)
-> Limit (cost=185.13..494.23 rows=50 width=40) (actual
time=101.000..261.000 rows=50 loops=1)
-> Nested Loop Left Join (cost=185.13..35676.18
rows=5741 width=40) (actual time=101.000..261.000 rows=50 loops=1)
-> Hash Left Join (cost=185.13..1098.65
rows=5741 width=80) (actual time=91.000..91.000 rows=50 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t
= ("inner". doc_documentid)::tex
t)
-> Seq Scan on document doc
(cost=0.00..827.41 rows=5741 width=80) (actual time=10.000..10.000 rows=50
loops=1)
-> Hash (cost=170.70..170.70 rows=5770
width=40) (actual time=81.000..81.000 rows=0 loops=1)
-> Seq Scan on folder_document
(cost=0.00..170.70 rows=5770 width=40) (actual time=10.000..61.000 rows=5719
loops=1)
-> Index Scan using pk_document on document root
(cost=0.00..6.01 rows=1 width=40) (actual time=3.400..3.400 rows=1 loops=50)
Index Cond:
(("outer". doc_internalrootxref
)::text = (root. doc_documentid)::tex
t)
SubPlan
-> Limit (cost=0.00..1.96 rows=1 width=40) (actual time=0.400..0.400
rows=0 loops=50)
-> Seq Scan on followup_document fd (cost=0.00..3.91 rows=2
width=40) (actual time=0.400..0.400 rows=0 loops=50)
Filter: (($1)::text = (doc_documentid)::te
xt)
-> Limit (cost=0.00..6.01 rows=1 width=40) (actual
time=17.620..17.620 rows=0 loops=50)
-> Result (cost=0.00..6.01 rows=1 width=40) (actual
time=17.620..17.620 rows=0 loops=50)
One-Time Filter: ($0 = true)
-> Index Scan using documentevent_index on documentevent
de (cost=0.00..6.01 rows=1 width=40) (actual time=28.419..28.419 rows=0
loops=31)
Index Cond: ((($1)::text = (doc_documentid)::te
xt)
AND (de_processedflag = false) AND ((de_documenteventst
atus)::text =
'ERROR'::text))
Total runtime: 5558.000 ms


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Gurpreet Aulakh

2005-09-21, 8:24 pm

I have started to break my query down and analyze each piece.
What I have discovered is very interesting.

First here is a small piece of my query.

EXPLAIN ANALYZE SELECT doc.doc_documentid FROM document AS doc
LEFT JOIN document as root ON doc. doc_internalRootXref
=
root.doc_documentId
LEFT JOIN folder_document ON doc.doc_documentid =
folder_document.doc_documentId LIMIT 500 OFFSET 0

When I run this on Postgres 8.0.3 running under windows this is the result

QUERY PLAN
Limit (cost=183.49..753.41 rows=500 width=40) (actual time=47.000..79.000
rows=500 loops=1)
-> Hash Left Join (cost=183.49..6702.23 rows=5719 width=40) (actual
time=47.000..79.000 rows=500 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t =
("inner". doc_documentid)::tex
t)
-> Merge Left Join (cost=0.00..6432.96 rows=5719 width=40) (actual
time=0.000..16.000 rows=500 loops=1)
Merge Cond: (("outer". doc_internalrootxref
)::text =
("inner". doc_documentid)::tex
t)
-> Index Scan using doc_internalrootxref
_index on document
doc (cost=0.00..3172.64 rows=5719 width=80) (actual time=0.000..0.000
rows=500 loops=1)
-> Index Scan using pk_document on document root
(cost=0.00..3174.53 rows=5719 width=40) (actual time=0.000..0.000 rows=863
loops=1)
-> Hash (cost=169.19..169.19 rows=5719 width=40) (actual
time=47.000..47.000 rows=0 loops=1)
-> Seq Scan on folder_document (cost=0.00..169.19 rows=5719
width=40) (actual time=0.000..16.000 rows=5719 loops=1)
Total runtime: 79.000 ms

Here is the result of running the same query on the Postgres 7.3 running
under Cygwin

QUERY PLAN
Limit (cost=183.49..775.31 rows=500 width=160) (actual time=13.00..44.00
rows=500 loops=1)
-> Hash Join (cost=183.49..6952.79 rows=5719 width=160) (actual
time=13.00..44.00 rows=501 loops=1)
Hash Cond: ("outer".doc_documentid = "inner".doc_documentid)
-> Merge Join (cost=0.00..6612.03 rows=5719 width=120) (actual
time=0.00..29.00 rows=775 loops=1)
Merge Cond: ("outer". doc_internalrootxref
=
"inner".doc_documentid)
-> Index Scan using doc_internalrootxref
_index on document
doc (cost=0.00..3254.39 rows=5719 width=80) (actual time=0.00..7.00
rows=775 loops=1)
-> Index Scan using pk_document on document root
(cost=0.00..3257.88 rows=5719 width=40) (actual time=0.00..15.00 rows=1265
loops=1)
-> Hash (cost=169.19..169.19 rows=5719 width=40) (actual
time=12.00..12.00 rows=0 loops=1)
-> Seq Scan on folder_document (cost=0.00..169.19 rows=5719
width=40) (actual time=0.00..9.00 rows=5719 loops=1)
Total runtime: 45.00 msec

What is really interesting is the time it takes for the Hash to occur. For
the first hash, on the 7.3 it takes only 12ms while on the 8.0 it takes
47ms.
Now the databases are created from the same data and I have run
vacuumdb -f -z on the databases.

Now I have read something on the archives that stated that perhaps the data
is in the filesystem (not database) cache. Would this be the case?. If so
how would I improve the performance under WIN2K?

Anyone have any ideas?


-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Gurpreet
Aulakh
Sent: September 21, 2005 12:38 PM
To: pgsql- performance@postgres
ql.org
Subject: [PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)


I currently have a Postgres 7.3 database running under WIN2K using cygwin
and want to move to Postgres 8.0.3 (native windows version).
I am finding most simple queries are significantly faster on the native
windows version compared to 7.3 (under cygwin).
However, for a complex query, that involve multiple JOINs, the 7.3 version
is actually faster (about 2X faster).

The query that I am running was optimized to run under 7.3. It was
specifically modified to control the planner with explicit JOINs.
When I run the same query on the 8.0.3 version with the join_collapse_limit
set to 1 the query is slower.

Can someone tell me why setting the join_collapse_limit to 1 in the 8.0
version does not produce similar results to the 7.3 version?
Does anyone have any suggestions on what I can do? Do I have to rewrite the
query?


Here are the results of an explain analyze on the query.

Explain analyze Postgres 7.3 running on WIN2K using cygwin.

Hash Join (cost=21808.27..1946264.80 rows=2982 width=1598) (actual
time=2186.00..2320.00 rows=50 loops=1)
Hash Cond: ("outer". doc_internalparentom
xref = "inner".doc_documentid)
-> Hash Join (cost=20948.78..1945323.29 rows=2982 width=1534) (actual
time=2110.00..2227.00 rows=50 loops=1)
Hash Cond: ("outer". doc_internalrootxref
= "inner".doc_documentid)
-> Hash Join (cost=20089.29..1944381.79 rows=2982 width=1484)
(actual time=2067.00..2179.00 rows=50 loops=1)
Hash Cond: ("outer".doc_documentid = "inner".doc_documentid)
Join Filter: ("inner".dc_doccontacttype = 'FROM'::character
varying)
-> Hash Join (cost=7455.14..1928613.59 rows=2982
width=1138) (actual time=1216.00..1539.00 rows=50 loops=1)
Hash Cond: ("outer".doc_documentid =
"inner".doc_documentid)
Join Filter: ("inner".dc_doccontacttype =
'TO'::character varying)
-> Hash Join (cost=183.49..1918519.06 rows=2860
width=792) (actual time=64.00..301.00 rows=50 loops=1)
Hash Cond: ("outer".doc_documentid =
"inner".doc_documentid)
-> Seq Scan on document finaldoc
(cost=0.00..1918256.94 rows=2860 width=717) (actual time=13.00..254.00
rows=50 loops=1)
Filter: (subplan)
SubPlan
-> Materialize (cost=335.27..335.27
rows=50 width=160) (actual time=0.00..0.01 rows=50 loops=5719)
-> Limit (cost=0.00..335.27
rows=50 width=160) (actual time=3.00..8.00 rows=50 loops=1)
-> Nested Loop
(cost=0.00..38347.95 rows=5719 width=160) (actual time=3.00..8.00 rows=51
loops=1)
-> Merge Join
(cost=0.00..3910.14 rows=5719 width=120) (actual time=3.00..3.00 rows=51
loops=1)
Merge Cond:
("outer".doc_documentid = "inner".doc_documentid)
-> Index Scan
using pk_document on document doc (cost=0.00..3256.48 rows=5719 width=80)
(actual time=1.00..1.00 rows=51 loops=1)
-> Index Scan
using pk_folder_document on folder_document (cost=0.00..553.91 rows=5719
width=40) (actual time=2.00..2.00 rows=51 loops=1)
-> Index Scan using
pk_document on document root (cost=0.00..6.01 rows=1 width=40) (actual
time=0.10..0.10 rows=1 loops=51)
Index Cond:
("outer". doc_internalrootxref
= root.doc_documentid)
-> Hash (cost=169.19..169.19 rows=5719
width=75) (actual time=31.00..31.00 rows=0 loops=1)
-> Seq Scan on folder_document
(cost=0.00..169.19 rows=5719 width=75) (actual time=0.00..11.00 rows=5719
loops=1)
-> Hash (cost=1328.80..1328.80 rows=34280 width=346)
(actual time=846.00..846.00 rows=0 loops=1)
-> Seq Scan on doccontact dcto
(cost=0.00..1328.80 rows=34280 width=346) (actual time=0.00..175.00
rows=34280 loops=1)
-> Hash (cost=1328.80..1328.80 rows=34280 width=346)
(actual time=445.00..445.00 rows=0 loops=1)
-> Seq Scan on doccontact dcfrom (cost=0.00..1328.80
rows=34280 width=346) (actual time=0.00..223.00 rows=34280 loops=1)
-> Hash (cost=845.19..845.19 rows=5719 width=50) (actual
time=42.00..42.00 rows=0 loops=1)
-> Seq Scan on document root (cost=0.00..845.19 rows=5719
width=50) (actual time=0.00..2.00 rows=5719 loops=1)
-> Hash (cost=845.19..845.19 rows=5719 width=64) (actual
time=73.00..73.00 rows=0 loops=1)
-> Seq Scan on document parentom (cost=0.00..845.19 rows=5719
width=64) (actual time=0.00..30.00 rows=5719 loops=1)
SubPlan
-> Limit (cost=0.00..5.56 rows=1 width=40) (actual time=0.06..0.06
rows=0 loops=50)
-> Result (cost=0.00..7.20 rows=1 width=40) (actual
time=0.06..0.06 rows=0 loops=50)
One-Time Filter: ($0 = true)
-> Index Scan using documentevent_index on documentevent
de (cost=0.00..7.20 rows=1 width=40) (actual time=0.07..0.07 rows=0
loops=44)
Index Cond: (($1 = doc_documentid) AND
(de_processedflag = false) AND (de_documenteventsta
tus = 'ERROR'::character
varying))
-> Limit (cost=0.00..3.86 rows=1 width=40) (actual time=0.10..0.10
rows=0 loops=50)

Explain analyze Postgres 8.0.3 running natively under WIN2K.

Hash IN Join (cost=5293.09..7121.89 rows=50 width=1369) (actual
time=1062.000..5558.000 rows=50 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t =
("inner". doc_documentid)::tex
t)
-> Hash Left Join (cost=4798.24..6199.29 rows=5741 width=1369) (actual
time=751.000..4236.000 rows=5719 loops=1)
Hash Cond: (("outer". doc_internalparentom
xref)::text =
("inner". doc_documentid)::tex
t)
-> Hash Left Join (cost=3956.48..5271.41 rows=5741 width=1345)
(actual time=541.000..3105.000 rows=5719 loops=1)
Hash Cond: (("outer". doc_internalrootxref
)::text =
("inner". doc_documentid)::tex
t)
-> Hash Left Join (cost=3114.72..4343.53 rows=5741
width=1335) (actual time=501.000..2313.000 rows=5719 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t =
("inner". doc_documentid)::tex
t)
-> Hash Left Join (cost=1649.92..2721.09 rows=5741
width=1039) (actual time=180.000..1342.000 rows=5719 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t =
("inner". doc_documentid)::tex
t)
-> Hash Left Join (cost=185.13..1098.65
rows=5741 width=743) (actual time=40.000..592.000 rows=5719 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t
= ("inner". doc_documentid)::tex
t)
-> Seq Scan on document finaldoc
(cost=0.00..827.41 rows=5741 width=708) (actual time=0.000..41.000 rows=5719
loops=1)
-> Hash (cost=170.70..170.70 rows=5770
width=75) (actual time=40.000..40.000 rows=0 loops=1)
-> Seq Scan on folder_document
(cost=0.00..170.70 rows=5770 width=75) (actual time=0.000..10.000 rows=5719
loops=1)
-> Hash (cost=1450.50..1450.50 rows=5718
width=336) (actual time=140.000..140.000 rows=0 loops=1)
-> Seq Scan on doccontact dcto
(cost=0.00..1450.50 rows=5718 width=336) (actual time=0.000..130.000
rows=5718 loops=1)
Filter: ((dc_doccontacttype)
::text =
'TO'::text)
-> Hash (cost=1450.50..1450.50 rows=5718 width=336)
(actual time=321.000..321.000 rows=0 loops=1)
-> Seq Scan on doccontact dcfrom
(cost=0.00..1450.50 rows=5718 width=336) (actual time=10.000..291.000
rows=5718 loops=1)
Filter: ((dc_doccontacttype)
::text =
'FROM'::text)
-> Hash (cost=827.41..827.41 rows=5741 width=50) (actual
time=40.000..40.000 rows=0 loops=1)
-> Seq Scan on document root (cost=0.00..827.41
rows=5741 width=50) (actual time=0.000..30.000 rows=5719 loops=1)
-> Hash (cost=827.41..827.41 rows=5741 width=64) (actual
time=210.000..210.000 rows=0 loops=1)
-> Seq Scan on document parentom (cost=0.00..827.41
rows=5741 width=64) (actual time=0.000..160.000 rows=5719 loops=1)
-> Hash (cost=494.73..494.73 rows=50 width=42) (actual
time=261.000..261.000 rows=0 loops=1)
-> Subquery Scan "IN_subquery" (cost=185.13..494.73 rows=50
width=42) (actual time=101.000..261.000 rows=50 loops=1)
-> Limit (cost=185.13..494.23 rows=50 width=40) (actual
time=101.000..261.000 rows=50 loops=1)
-> Nested Loop Left Join (cost=185.13..35676.18
rows=5741 width=40) (actual time=101.000..261.000 rows=50 loops=1)
-> Hash Left Join (cost=185.13..1098.65
rows=5741 width=80) (actual time=91.000..91.000 rows=50 loops=1)
Hash Cond: (("outer". doc_documentid)::tex
t
= ("inner". doc_documentid)::tex
t)
-> Seq Scan on document doc
(cost=0.00..827.41 rows=5741 width=80) (actual time=10.000..10.000 rows=50
loops=1)
-> Hash (cost=170.70..170.70 rows=5770
width=40) (actual time=81.000..81.000 rows=0 loops=1)
-> Seq Scan on folder_document
(cost=0.00..170.70 rows=5770 width=40) (actual time=10.000..61.000 rows=5719
loops=1)
-> Index Scan using pk_document on document root
(cost=0.00..6.01 rows=1 width=40) (actual time=3.400..3.400 rows=1 loops=50)
Index Cond:
(("outer". doc_internalrootxref
)::text = (root. doc_documentid)::tex
t)
SubPlan
-> Limit (cost=0.00..1.96 rows=1 width=40) (actual time=0.400..0.400
rows=0 loops=50)
-> Seq Scan on followup_document fd (cost=0.00..3.91 rows=2
width=40) (actual time=0.400..0.400 rows=0 loops=50)
Filter: (($1)::text = (doc_documentid)::te
xt)
-> Limit (cost=0.00..6.01 rows=1 width=40) (actual
time=17.620..17.620 rows=0 loops=50)
-> Result (cost=0.00..6.01 rows=1 width=40) (actual
time=17.620..17.620 rows=0 loops=50)
One-Time Filter: ($0 = true)
-> Index Scan using documentevent_index on documentevent
de (cost=0.00..6.01 rows=1 width=40) (actual time=28.419..28.419 rows=0
loops=31)
Index Cond: ((($1)::text = (doc_documentid)::te
xt)
AND (de_processedflag = false) AND ((de_documenteventst
atus)::text =
'ERROR'::text))
Total runtime: 5558.000 ms


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings




---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Tom Lane

2005-09-22, 3:24 am

"Gurpreet Aulakh" <gaulakh@ecmarket.com> writes:
> What is really interesting is the time it takes for the Hash to occur. For
> the first hash, on the 7.3 it takes only 12ms while on the 8.0 it takes
> 47ms.


You haven't told us a thing about the column datatypes involved (much
less what the query actually is) ... but I wonder if this is a textual
datatype and the 8.0 installation is using a non-C locale where the 7.3
installation is using C locale. That could account for a considerable
slowdown in text comparison speeds.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Gurpreet Aulakh

2005-09-22, 8:25 pm

Hi,

Here is the information that you requested.

The sub query that I am using is

EXPLAIN ANALYZE SELECT doc.doc_documentid FROM document AS doc
LEFT JOIN document as root
ON doc. doc_internalRootXref
= root.doc_documentId
LEFT JOIN folder_document ON doc.doc_documentid =
folder_document.doc_documentId
LIMIT 500 OFFSET 0


The column doc_documentid is character varying(48) on both tables (document,
folder_document).
The column doc_internalRootXref
is also character varying(48)
doc_documentid and doc_internalRootXref
are UUIDs that is 36 chars long.

The document table has 58 columns.
31 columns are varchar ranging from size 8 to 80
7 booleans
4 numeric(12,2)
8 timestamp with time zone
1 integer
1 bigint
5 text

The folder_documen table has 6 columns
4 varchar (2 of length 16 2 of length 48)

The following indexes are on the document table
pk_document primary key btree (doc_documentid),
document_pk unique btree (doc_documentid),
doc_deliverydate_ind
ex btree (doc_deliverydate),
doc_externalxref_ind
ex btree (doc_externalxref),
doc_internalparentom
xref_index btree (doc_internalparento
mxref),
doc_internalrootxref
_index btree (doc_internalrootxre
f)
The following indexes are on the folder_document table
pk_folder_document primary key btree (doc_documentid)
fk_folder_document1 FOREIGN KEY (fld_folderid) REFERENCES
folder(fld_folderid)

ON UPDATE RESTRICT ON DELETE CASCADE,
fk_folder_document2 FOREIGN KEY (doc_documentid) REFERENCES
document(doc_documen
tid)
ON UPDATE RESTRICT ON DELETE CASCADE

After reading your hint about locale settings, I reinstalled postgres and
made sure the locale was set
to C and that the encoding was SQL_ASCII. (these are the settings on the
cygwin installation).

I still get the same results in the last post.


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: September 21, 2005 8:13 PM
To: Gurpreet Aulakh
Cc: pgsql- performance@postgres
ql.org
Subject: Re: [PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)


"Gurpreet Aulakh" <gaulakh@ecmarket.com> writes:
> What is really interesting is the time it takes for the Hash to occur. For
> the first hash, on the 7.3 it takes only 12ms while on the 8.0 it takes
> 47ms.


You haven't told us a thing about the column datatypes involved (much
less what the query actually is) ... but I wonder if this is a textual
datatype and the 8.0 installation is using a non-C locale where the 7.3
installation is using C locale. That could account for a considerable
slowdown in text comparison speeds.

regards, tom lane




---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Gurpreet Aulakh

2005-09-23, 1:23 pm

After further investigation I have found that the reason why the query is
slower on 8.0.3 is that the hash and hash joins are slower on the 8.0.3.

So the question comes down to : Why are hash and hash joins slower? Is this
a postgres configuration setting that I am missing? Is the locale still
screwing me up? I have set the locale to 'C' without any improvements. Is it
because the column type is a varchar that the hash is slower?




---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Tom Lane

2005-09-23, 8:24 pm

"Gurpreet Aulakh" <gaulakh@ecmarket.com> writes:
> After further investigation I have found that the reason why the query is
> slower on 8.0.3 is that the hash and hash joins are slower on the 8.0.3.
> So the question comes down to : Why are hash and hash joins slower?


I looked into this a bit and determined that the problem seems to have
been introduced here:

2002-12-30 10:21 tgl

* src/: backend/executor/nodeHash.c,
backend/executor/nodeHashjoin.c, backend/optimizer/path/costsize.c,
include/executor/nodeHash.h: Better solution to integer overflow
problem in hash batch-number computation: reduce the bucket number
mod nbatch. This changes the association between original bucket
numbers and batches, but that doesn't matter. Minor other cleanups
in hashjoin code to help centralize decisions.

(which means it's present in 7.4 as well as 8.0). The code now
groups tuples into hash batches according to
(hashvalue % totalbuckets) % nbatch
When a tuple that is not in the first batch is reloaded, it is placed
into a bucket according to
(hashvalue % nbuckets)
This means that if totalbuckets, nbatch, and nbuckets have a common
factor F, the buckets won't be evenly used; in fact, only one in every F
buckets will be used at all, the rest remaining empty. The ones that
are used accordingly will contain about F times more tuples than
intended. The slowdown comes from having to compare these extra tuples
against the outer-relation tuples.

7.3 uses a different algorithm for grouping tuples that avoids this
problem, but it has performance issues of its own (in particular, to
avoid integer overflow we have to limit the number of batches we can
have). So just reverting this patch doesn't seem very attractive.

The problem no longer exists in 8.1 because of rewrites undertaken for
another purpose, so I'm sort of tempted to do nothing. To fix this in
the back branches we'd have to develop new code that won't ever go into
CVS tip and thus will never get beta-tested. The risk of breaking
things seems higher than I'd like.

If we did want to fix it, my first idea is to increment nbatch looking
for a value that has no common factor with nbuckets.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Gurpreet Aulakh

2005-09-26, 1:24 pm

Thanks for your help Tom.

While testing 8.1, I found that simple joins take longer in 8.1 than 8.0.
For example the sub query

SELECT doc.doc_documentid FROM document AS doc LEFT JOIN folder_document ON
doc.doc_documentid = folder_document.doc_documentId LEFT JOIN document as
root ON doc. doc_internalRootXref
= root.doc_documentId

is actually slower on 8.1 than 8.0.

However, the full query that I will be running is much faster. In my
evaluation I found the same pattern. That simple joins were slower but
complex joins were faster.

Overall though, 8.1 is faster and we will probably be moving to it when it's
officially released.

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: September 23, 2005 2:13 PM
To: Gurpreet Aulakh
Cc: pgsql- performance@postgres
ql.org
Subject: Re: [PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)


"Gurpreet Aulakh" <gaulakh@ecmarket.com> writes:
> After further investigation I have found that the reason why the query is
> slower on 8.0.3 is that the hash and hash joins are slower on the 8.0.3.
> So the question comes down to : Why are hash and hash joins slower?


I looked into this a bit and determined that the problem seems to have
been introduced here:

2002-12-30 10:21 tgl

* src/: backend/executor/nodeHash.c,
backend/executor/nodeHashjoin.c, backend/optimizer/path/costsize.c,
include/executor/nodeHash.h: Better solution to integer overflow
problem in hash batch-number computation: reduce the bucket number
mod nbatch. This changes the association between original bucket
numbers and batches, but that doesn't matter. Minor other cleanups
in hashjoin code to help centralize decisions.

(which means it's present in 7.4 as well as 8.0). The code now
groups tuples into hash batches according to
(hashvalue % totalbuckets) % nbatch
When a tuple that is not in the first batch is reloaded, it is placed
into a bucket according to
(hashvalue % nbuckets)
This means that if totalbuckets, nbatch, and nbuckets have a common
factor F, the buckets won't be evenly used; in fact, only one in every F
buckets will be used at all, the rest remaining empty. The ones that
are used accordingly will contain about F times more tuples than
intended. The slowdown comes from having to compare these extra tuples
against the outer-relation tuples.

7.3 uses a different algorithm for grouping tuples that avoids this
problem, but it has performance issues of its own (in particular, to
avoid integer overflow we have to limit the number of batches we can
have). So just reverting this patch doesn't seem very attractive.

The problem no longer exists in 8.1 because of rewrites undertaken for
another purpose, so I'm sort of tempted to do nothing. To fix this in
the back branches we'd have to develop new code that won't ever go into
CVS tip and thus will never get beta-tested. The risk of breaking
things seems higher than I'd like.

If we did want to fix it, my first idea is to increment nbatch looking
for a value that has no common factor with nbuckets.

regards, tom lane




---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Tom Lane

2005-09-26, 8:25 pm

"Gurpreet Aulakh" <gaulakh@ecmarket.com> writes:
> While testing 8.1, I found that simple joins take longer in 8.1 than 8.0.
> For example the sub query
> SELECT doc.doc_documentid FROM document AS doc LEFT JOIN folder_document ON
> doc.doc_documentid = folder_document.doc_documentId LEFT JOIN document as
> root ON doc. doc_internalRootXref
= root.doc_documentId
> is actually slower on 8.1 than 8.0.


With no more detail than that, this report is utterly unhelpful. Let's
see the table schemas and the EXPLAIN ANALYZE results in both cases.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql
.org so that your
message can get through to the mailing list cleanly

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com