Home > Archive > PostgeSQL ODBC > February 2006 > Slow query through ODBC









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Slow query through ODBC
Arnaud Lesauvage

2006-01-23, 7:23 am

Hi all !

I have a query that runs very fast when executed through pgAdmin
(~2 seconds), but very slowly through the ODBC driver (~30
seconds). Both queries are executed from the same workstation on
the same server.

My workstation is a WinXP Pro, and I use the latest psqlODBC
driver available as an MSI package (version 8.01.0102).
The server is a PostGreSQL 8.1.2, running on a Win2000 server.

The query is very simple, it returns about 1500 lines, made of
only one text field, but this field can be very large (it is the
WKT representation of a spatial object from PostGIS).

I am connecting to the server in a VBA project with a simple
connectionstring with only default options (I have not found a
document explaining how to fine tune the odbc driver).
My connectionstring is :
"DRIVER={PostgreSQL
Unicode};SERVER=myse
rver;PORT=5432;DATAB
ASE=mydatabase;UID=m
yuser;PWD=mypass"

I assume that the problem is with the very large field size that
is returned, but what should I change in my connectionstring to
make this query run faster ?

Thanks for your help !

--
Arnaud



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Ludek Finstrle

2006-01-23, 7:23 am

> My workstation is a WinXP Pro, and I use the latest psqlODBC
> driver available as an MSI package (version 8.01.0102).


Don't you have problem with geting whole data correctly? There
is bug in psqlodbc driver when reading data from text field larger
then 2048 (+-) bytes (in Unicode driver). The problem was fixed
in 08.01.0106 development snapshot.

> The server is a PostGreSQL 8.1.2, running on a Win2000 server.
>
> The query is very simple, it returns about 1500 lines, made of


Are you sure the PgAdmin returns all rows (I'm not). Maybe Dave
give us the answer.

> only one text field, but this field can be very large (it is the
> WKT representation of a spatial object from PostGIS).
>
> I am connecting to the server in a VBA project with a simple
> connectionstring with only default options (I have not found a
> document explaining how to fine tune the odbc driver).
> My connectionstring is :
> "DRIVER={PostgreSQL
> Unicode};SERVER=myse
rver;PORT=5432;DATAB
ASE=mydatabase;UID=m
yuser;PWD=mypass"
>
> I assume that the problem is with the very large field size that
> is returned, but what should I change in my connectionstring to
> make this query run faster ?


I don't know it exactly. Could you try experiment with parameters
such Use declare/fetch or Server side prepare?
It could help me when you send me the mylog output (enable the mylog
output under Global and DataSource button).

I hope the data isn't confidental (they are listed in mylog output).

Regards,

Luf

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Arnaud Lesauvage

2006-01-23, 7:23 am

Ludek Finstrle a écrit :
>
> Don't you have problem with geting whole data correctly? There
> is bug in psqlodbc driver when reading data from text field larger
> then 2048 (+-) bytes (in Unicode driver). The problem was fixed
> in 08.01.0106 development snapshot.


No, the data looks good. I am drawing the spatial objects with the
query results, and my code would throw an error if some data was
incorrect.

>
> Are you sure the PgAdmin returns all rows (I'm not). Maybe Dave
> give us the answer.


Yes. Actually, pgAdmin asks me if I want all rows (default is to
return 100 rows). I answer yes and the result is almost immediate.
If I ask for an output in a text file, the query runs in ~2
seconds also.

>
> I don't know it exactly. Could you try experiment with parameters
> such Use declare/fetch or Server side prepare?
> It could help me when you send me the mylog output (enable the mylog
> output under Global and DataSource button).


I will try with these parameters, but how do I add them to the
connectionstring ?
Should I just append something like ";usedeclarefecth=1" at the
end of it ? I can't find any documentation on this (maybe I am not
searching at the right place...).

> I hope the data isn't confidental (they are listed in mylog output).


Unfortunately the data is confidential.
If we can't find a simple solution, I might just remove the data
from the log file ?


Thanks for helping me on this !


--
Arnaud
--
Arnaud


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Ludek Finstrle

2006-01-23, 7:23 am

> >Don't you have problem with geting whole data correctly? There

>
> No, the data looks good. I am drawing the spatial objects with the
> query results, and my code would throw an error if some data was
> incorrect.


Ok. I describe the problem better. The problem was that last part
of each row returned from psqlodbc are random data from memory.
So if the data for one column (and row) is larger then N bytes (some
users has problem with N = 2048 bytes) the problem raised.
It means that only last +- 1 .. N bytes for each row are filled
randomely from memory.
You didn't must run in the problem if the data are fetched in one call
of SQLGetData. I only want to notice it. It could help you in the
future ;-)

>
> I will try with these parameters, but how do I add them to the
> connectionstring ?
> Should I just append something like ";usedeclarefecth=1" at the
> end of it ? I can't find any documentation on this (maybe I am not
> searching at the right place...).


I'm sorry I don't know it exactly (I use ODBC manager).
Maybe you could configure it through ODBC manager ...

I try to search it in source code and it could be (fullname and
shortcut):
;UseDeclareFetch=1
;B6=1

;UseServerSidePrepar
e=1
;C8=1

Maybe it's case insensitive. I don't know it exactly.
Don't use it both at the time. You only waste your time if you try it.

>
> Unfortunately the data is confidential.
> If we can't find a simple solution, I might just remove the data
> from the log file ?


You could remove the data from log (only keep there a note that data
is fetched and the data length could be useful too).
You want to seatch CC_mapping and PGAPI_GetData in the log.
There is at least one line per row. Maybe it could be easier when
you replace your data with 'x' or something like that.

If I can't find the problem in mylog output could you create example
data?

Regards,

Luf

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Arnaud Lesauvage

2006-01-23, 7:23 am

Ludek Finstrle a écrit :
>
> Ok. I describe the problem better. The problem was that last part
> of each row returned from psqlodbc are random data from memory.
> So if the data for one column (and row) is larger then N bytes (some
> users has problem with N = 2048 bytes) the problem raised.
> It means that only last +- 1 .. N bytes for each row are filled
> randomely from memory.
> You didn't must run in the problem if the data are fetched in one call
> of SQLGetData. I only want to notice it. It could help you in the
> future ;-)


Good point Ludek. I double checked my data, and it definitively
good in the output.
FYI, the largest field returned is 23581 characters long. The
dataset is ~1.6MB long (so we have an average of 1kB per row).

>
> I'm sorry I don't know it exactly (I use ODBC manager).
> Maybe you could configure it through ODBC manager ...
>
> I try to search it in source code and it could be (fullname and
> shortcut):
> ;UseDeclareFetch=1
> ;B6=1
>
> ;UseServerSidePrepar
e=1
> ;C8=1
>
> Maybe it's case insensitive. I don't know it exactly.
> Don't use it both at the time. You only waste your time if you try it.


I tried both parameters (they look OK because ADO did not throw an
error), but they did not help.
Maybe I should add that the query is stored in an ADO recordset,
which is read-only and forward-only ?
It is the .open method of this recordset that actually takes 30
seconds to run.

> You could remove the data from log (only keep there a note that data
> is fetched and the data length could be useful too).
> You want to seatch CC_mapping and PGAPI_GetData in the log.
> There is at least one line per row. Maybe it could be easier when
> you replace your data with 'x' or something like that.
>
> If I can't find the problem in mylog output could you create example
> data?


I sure could !
I'll try some other parameters before (I will use the File-DSN
panel to generate connectionstrings, I should have thought about
it before !), and I'll tell you whether I found something or not.


--
Arnaud


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Ludek Finstrle

2006-01-23, 9:23 am

> I tried both parameters (they look OK because ADO did not throw an
> error), but they did not help.


I think it. But we could test it at first :-)

> Maybe I should add that the query is stored in an ADO recordset,
> which is read-only and forward-only ?


This is the best situation for psqlODBC driver.

> It is the .open method of this recordset that actually takes 30
> seconds to run.


..open ADO contains a lot of SQL* commands for ODBC.
I need the mylog or (the better for me) example data to reproduce it
here.

>
> I sure could !


Great.

> I'll try some other parameters before (I will use the File-DSN
> panel to generate connectionstrings, I should have thought about
> it before !), and I'll tell you whether I found something or not.


Ok. I'm looking for note from you.
If you'll not find the way please send me the parsed log or
example data with SELECT.

Regards,

Luf

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Ludek Finstrle

2006-01-24, 8:23 pm

Tue, Jan 24, 2006 at 04:47:04PM +0100, Arnaud Lesauvage wrote:
> I went on with my investigations.
>
> I now believe that the problem lies within the network dialog
> (between my client and my server).
> If I transfer this table on my workstation (I have a local
> postgresql installation too), the query runs blindingly fast (less
> than 1 sec).
> I don't understand what the problem could be though.
>
> This thread in the mailing list archive looks like the problem
> mentionned is the same as mine, but they don't really mention a fix :
> http://archives.postgresql.org/pgsq...06/msg00593.php
>
> My server is on the LAN, so bandwidth cannot be a problem.
> Furthermore, I just migrated from a MySQL 5.0 server. The exact
> same query on this MySQL server, through MyODBC, took less than 1
> second too, so my network is not the problem.
>
> Are there specific client/server protocol tuning parameters ?


I know about nothing like that.

> Have you ever heard of such issues ?


No, I haven't heard it with 08.01 yet.

Regards,

Luf

P.S. I reply mainly for Cc: pgsql-odbc - more heads more knowledge ;-)

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Dave Page

2006-01-26, 4:54 pm



> -----Original Message-----
> From: Ludek Finstrle & #91;mailto:luf@pzkag
is.cz]
> Sent: 24 January 2006 22:35
> To: Arnaud Lesauvage
> Cc: Dave Page
> Subject: Re: [ODBC] Slow query through ODBC
>
> It seems to me I have to kick Dave to take a look at this :-)
> Dave is pgAdmin developer so he could give a light on the problem.
>
> Dave: Could you take a look at this problem?


Arnaud,

When you are testing, please make sure that the psqlODBC's mylog and
commlog are switched off, and that SQL Tracing is turned off in the ODBC
control panel applet.

Also, please note that pgAdmin has very lightweight data access classes
written over libpq - much more lightweight than the ODBC driver. Most of
the overhead in pgAdmin comes from rendering the data in the grid
control.

Regards, Dave.

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Arnaud Lesauvage

2006-01-30, 3:23 am

Dave Page a écrit :
>
> Arnaud,
>
> When you are testing, please make sure that the psqlODBC's mylog and
> commlog are switched off, and that SQL Tracing is turned off in the ODBC
> control panel applet.
>
> Also, please note that pgAdmin has very lightweight data access classes
> written over libpq - much more lightweight than the ODBC driver. Most of
> the overhead in pgAdmin comes from rendering the data in the grid
> control.
>
> Regards, Dave.


Dave,

All logging is off when I run tests. I only turned them on on
Ludek's request for debugging purposes. Tracing is always off in
the ODBC pannel.

What looks strange to me is that while the query is running
(during the "open" method of the recordset), processor usage is 0%
on the server and the workstation. Network usage is very low :
~50kB on a 100Mbps connection, during the entire query.

In pgAdmin, the query is immediate and there is just a pike in the
network traffic when the data is received. That's what I was
expecting to see through psqlODBC also...

Regards
--
Arnaud


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Arnaud Lesauvage

2006-02-10, 7:23 am

Hi List !

I have some news about this problem !
Just to remind you the facts :
A query runs very slowly when executed through the psqlodbc
driver, but very fast when executed through psql or pgAdmin.
The query returns very long rows containing only one field, but
this field can be as long as 40.000 characters (it is a text field).

I posted a TCP/IP dump of the dialog between my workstation and
the postgresql server to a TCP/IP newsgroup.
I fixed the problem by changing a parameter in my TCP/IP
configuration (on my workstation, not on the server).
I have set the TcpAckFrequency key to 1, as suggested by someone
on the comp.protocols.tcp-ip NG. The procedure is described in :
http://support.microsoft.com/?scid=...28890&x=15&y=10

*But* the interesting thing is that some people suggested that the
problem was lying in the way that the server and the client
talked to each other.

Let me first post the explanation from Bill Meier, on the
Ethereal-users mailing-list, then the one from Chris Marget on the
comp.protocols.tcp-ip newsgroup.

Bill Meier wrote :
> The short answer:
>
> "TCP_NODELAY" should be enabled on the TCP connection to
> the database on both the client and server side of the connection.
> (This is also known as "disabling the Nagle Algorithm").
> (Based upon your capture I can almost guarantee that TCP_NODELAY is
> *not* enabled on your Database TCP connection on the server side).
>
> The use of TCP_NODELAY is a database option for a different database server with
> which I'm familiar. I suggest you consult your DBA with respect to PostgreSQL.
> (Also Google "database tcp_nodelay" & etc for information).
>
> The longer answer:
>
> Delays may be observerd in client/server query/response types
> of applications over a TCP connection with the Nagle Algorithm enabled.
>
> The presence or absence of a delay is dependent upon the length of
> the query and/or the query response; this causes much confusion when trying
> to analyze delay problems because the delays will seem to "come and go"
> depending upon the exact length of the query and/or response.
>
> There is also much confusion as to the exact nature of the problem.
>
> I believe the current edition of Stevens' "TCPIP Illustrated" gives a pretty clear
> explanation of the issue.
>
>
> The details in your case:
>
> I believe that due to the Nagle Algorithm, for the specific query reponse
> shown in the capture, each time your server sends a "short" packet,
> the server waits for an ACK from the client before sending the next packet.
> In your case, the server application is sending the response in chunks of
> 8192 data bytes which results in a "short" packet every 6 packets).
>
> (I would suspect that a capture of a query/response from one of the
> workstations on which the query runs "very fast" may show that the query response
> is of a different length).



Chris Marget wrote :
>
> Are these two applications hitting different socket code on the server
> end?
>
> I have an area for you to explore: There's a method of socket
> programming in windows called "io completion ports" (or somesuch).
> Rather than streaming data into the socket UNIX style, the application
> code allocates a buffer somewhere, fills it and then tells the stack:
> "The buffer is here, please send it, then let me know when I can have
> the buffer back."
>
> Some see an advantage to this type of coding because it eliminates a
> copy of data from the application memory area into stack memory area.
>
> The problem comes from the fact that the buffer cannot be reused by the
> application until the receiving TCP has ACKed all the data.
>
> Here's what I think is happening:
>
> Your server is using one of these zero-copy stack methods. It needs to
> send 44078 bytes to the client. The buffer I mentioned is only 8KB, so
> it must be reused 6 times in order to send all 44078 bytes.
>
> So, 8KB is copied in and the stack is asked to send.
> Bad luck causes it to take ~200ms to receive ACKs for all that data
> because of your stack tuning and timing issues.
>
> The buffer becomes available again and another 8KB is copied in.
> ~200ms again.
>
> This cycle repeats 6 times until all 44078 bytes are send and ACKed.
> Nearly 1 second has elapsed.
>
> A smarter application on the server would have used a bigger buffer.
>
> An even smarter application would have used many buffers. 8KB buffers
> are okay if you fill buffer A, notify the stack, fill buffer B, notify
> the stack, etc...
>
> Something else entirely may be going on, but this is my guess.



Both these explanations are far beyond my knowledge !
I am quite confident though that the developers of the libpq.dll
will understand what this is all about.

Was this information of some help to you ?

Best regards,
--
Arnaud



---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Magnus Hagander

2006-02-10, 7:23 am

Since it works with different speeds in different clients, it *probably*
isn't on the server side. I'm far from sure on that though, there may be
other paramters that are implicitly changed when ODBC is used.

It's interesting to note that your problem is with a single large field.
If you query for approximatly the same amount of data *but in several
small fields*, do you get the same behaviour? Or does it happen only
with large fields?


> I posted a TCP/IP dump of the dialog between my workstation
> and the postgresql server to a TCP/IP newsgroup.
> I fixed the problem by changing a parameter in my TCP/IP
> configuration (on my workstation, not on the server).
> I have set the TcpAckFrequency key to 1, as suggested by
> someone on the comp.protocols.tcp-ip NG. The procedure is
> described in :
> http://support.microsoft.com/?scid=...28890&x=15&y=10


Interesting - I wouldn't have expected that to make a difference.

<snip>


> the database
> database
> DBA with respect to PostgreSQL.

We enable TCP_NODELAY on both server and client, when available. And it
is available - I've double-checked and that code is indeed compiled into
both libpq and the server on 8.1 at least.

[color=darkred]
> application
> the stack:
> I can have
> eliminates a
> memory area.


We don't use this.
We do, however, buffer it in an application side driver and then send it
off with a single call to send(). On the server, that is. Perhaps it can
have a similar effect.

But I don't see how this can make a difference between pgadmin and odbc
and psql, since they all use the same code!
Might be interesting to write up a small test program that uses just
libpq and see what is required to repro the problem there.


//Magnus

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Arnaud Lesauvage

2006-02-10, 7:23 am

Magnus Hagander a écrit :
> Since it works with different speeds in different clients, it *probably*
> isn't on the server side. I'm far from sure on that though, there may be
> other paramters that are implicitly changed when ODBC is used.


Yes, that was my first impression also. But the fact that the
server waits for an ACK from the client before it goes on sending
packet is quite strange, isn't it ?

> It's interesting to note that your problem is with a single large field.
> If you query for approximatly the same amount of data *but in several
> small fields*, do you get the same behaviour? Or does it happen only
> with large fields?


I have splitted the data into 4 fields.
The largest field was ~23.000 characters long, so I made 4 fields
containing. The first one contains the first 6000, the second one
from 6001 to 12000, etc...
The query took about half the time to run !

> We enable TCP_NODELAY on both server and client, when available. And it
> is available - I've double-checked and that code is indeed compiled into
> both libpq and the server on 8.1 at least.


I run 8.1.2, so I assume that it is enabled on my workstation and
on the server.

> But I don't see how this can make a difference between pgadmin and odbc
> and psql, since they all use the same code!


Yes, I really don't understand either !
But even though all use libpq.dll, they might not handle the data
the same way. Mayboe ODBC needs to check for data type, field
length, etc... and the other frontends do not need this kind of
information ?

Regards
--
Arnaud


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql
.org so that your
message can get through to the mailing list cleanly

Magnus Hagander

2006-02-10, 7:23 am

> Magnus Hagander a écrit :
> that though,
> when ODBC is used.
>
> Yes, that was my first impression also. But the fact that the
> server waits for an ACK from the client before it goes on
> sending packet is quite strange, isn't it ?


Yes. Most definitly.


> large field.
> in several
> happen only
>
> I have splitted the data into 4 fields.
> The largest field was ~23.000 characters long, so I made 4
> fields containing. The first one contains the first 6000, the
> second one from 6001 to 12000, etc...
> The query took about half the time to run !


Interesting. Did it make any difference to the timings in psql and/or pgadmin?


> available. And
> indeed compiled
>
> I run 8.1.2, so I assume that it is enabled on my workstation
> and on the server.


Yup.


>
> Yes, I really don't understand either !
> But even though all use libpq.dll, they might not handle the
> data the same way. Mayboe ODBC needs to check for data type,
> field length, etc... and the other frontends do not need this
> kind of information ?


Hm. That's an interesting observation. When you do a query with ODBC, does it do any "extra queries" automatically on the system tables? Enable statement logging on the server and see if something weird shows up.

//Magnus

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Arnaud Lesauvage

2006-02-10, 7:23 am

>> I have splitted the data into 4 fields.
>
> Interesting. Did it make any difference to the timings in psql and/or pgadmin?


No, not at all. Actually, it even runs a tiny bit slower in
pgAdmin (2.05 sec vs 1.95 for the "one large field" version).


>
> Hm. That's an interesting observation. When you do a query with ODBC, does it do any "extra queries" automatically on the system tables? Enable statement logging on the server and see if something weird shows up.


No, no query on the system tables.
I am not very familiar with the log though, maybe Ludek could tell
us more about that ?

Regards
--
Arnaud


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Ludek Finstrle

2006-02-10, 7:23 am

> >>Yes, I really don't understand either !
>
> No, no query on the system tables.
> I am not very familiar with the log though, maybe Ludek could tell
> us more about that ?


If I remember it right we measure the time problem in LIBPQ_send_query
function in connection.c (I'm not sure with filename).
Magnus, could you take a look at the code? Maybe you find the problem.
I'm sorry I have no time for it today.
BTW we change the code between 08.01.0102 and 08.01.0200 from PQexec
to PQsendquery and PQgetresult (I write this from my head so it could
be little different names). But I think Arnaud has problem even with
08.01.0102.

Regards,

Luf

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql
.org so that your
message can get through to the mailing list cleanly

Magnus Hagander

2006-02-10, 7:23 am

> > >>Yes, I really don't understand either !
> with ODBC,
> something weird shows up.
> could tell us
>
> If I remember it right we measure the time problem in
> LIBPQ_send_query function in connection.c (I'm not sure with
> filename).
> Magnus, could you take a look at the code? Maybe you find the problem.
> I'm sorry I have no time for it today.
> BTW we change the code between 08.01.0102 and 08.01.0200 from
> PQexec to PQsendquery and PQgetresult (I write this from my
> head so it could be little different names). But I think
> Arnaud has problem even with 08.01.0102.


That's very interestnig information. Because that was one thing that was
different.

Arnaud, any chance you can giev it a try with 8.01.0200?

//Magnus

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Ludek Finstrle

2006-02-10, 7:23 am

> > If I remember it right we measure the time problem in
>
> That's very interestnig information. Because that was one thing that was
> different.
>
> Arnaud, any chance you can giev it a try with 8.01.0200?


Please try both, 08.01.0102 and 08.01.0200. Does it differ?

Thanks,

Luf

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql
.org so that your
message can get through to the mailing list cleanly

Arnaud Lesauvage

2006-02-10, 9:23 am

Ludek Finstrle a écrit :
>
> Please try both, 08.01.0102 and 08.01.0200. Does it differ?


I just tried both versions and there is absolutely no difference.

I'll stick with .0200 of course !

Regards
--
Arnaud


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com