Drop Table

Support Forum for database administrators and web based access to important newsgroups related to databases
Register on Database Support Forum Edit your profileCalendarFind other Database Support forum membersFrequently Asked QuestionsSearch this forum -> 
For Database admins: Free Database-related Magazines Now Free shipping to Texas


Post New Thread










Thread
Author

Very slow distinct select
My table looks like this:
char(150) HTTP_REF,
char(250) HTTP_USER,
char(150) REMOTE_ADDR,
char(150) REMOTE_HOST,
char(150) URL,
smalldatetime TIME_STAMP

There are no indexes on this table and there are only 293,658 records total.

When I do a select like this it takes forever:

SELECT COUNT(DISTINCT REMOTE_ADDR)

Takes 2 minutes. Is there anyway to speed that up?

Thanks



Report this thread to moderator Post Follow-up to this message
Old Post
Rich
07-26-05 01:29 AM


Re: Very slow distinct select
You can speed up this query with a non-clustered index on REMOTE_ADDR.

--
Hope this helps.

Dan Guzman
SQL Server MVP

"Rich" <no@spam.invalid> wrote in message
news:DbfFe.48792$4o.35562@fed1read06...
> My table looks like this:
> char(150) HTTP_REF,
> char(250) HTTP_USER,
> char(150) REMOTE_ADDR,
> char(150) REMOTE_HOST,
> char(150) URL,
> smalldatetime TIME_STAMP
>
> There are no indexes on this table and there are only 293,658 records
> total.
>
> When I do a select like this it takes forever:
>
> SELECT COUNT(DISTINCT REMOTE_ADDR)
>
> Takes 2 minutes. Is there anyway to speed that up?
>
> Thanks
>
>



Report this thread to moderator Post Follow-up to this message
Old Post
Dan Guzman
07-26-05 01:29 AM


Re: Very slow distinct select
REMOTE_ADDR has duplicate values. So I added a numeric identity column and
did this:

CREATE UNIQUE NONCLUSTERED INDEX Requests_RemoteAddr ON Requests
(REMOTE_ADDR, ID)

But that didn't seem to help the speed any.

Any other ideas?

Thanks.


"Dan Guzman" <guzmanda@nospam-online.sbcglobal.net> wrote in message
news:09gFe.2034$PC7.1092@newssvr29.news.prodigy.net...
> You can speed up this query with a non-clustered index on REMOTE_ADDR.
>
> --
> Hope this helps.
>
> Dan Guzman
> SQL Server MVP
>
> "Rich" <no@spam.invalid> wrote in message
> news:DbfFe.48792$4o.35562@fed1read06... 
>
>



Report this thread to moderator Post Follow-up to this message
Old Post
Rich
07-26-05 01:29 AM


Re: Very slow distinct select
Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. Sample data is also a good idea, along with clear
specifications.  It is very hard to debug code when you do not let us
see it.

You did not so much as tell us the **name* of this table!  There are
not indexes?

Does your boss -- who is paying you, unlike us who are doing your work
for you for free -- give sooooooo little to work with?


Report this thread to moderator Post Follow-up to this message
Old Post
--CELKO--
07-26-05 01:29 AM


Re: Very slow distinct select
What the heck is DDL???
Didn't I give enough info in my two posts?
And what does the name matter?


"--CELKO--" <jcelko212@earthlink.net> wrote in message
news:1122342953.732318.259740@g44g2000cwa.googlegroups.com...
> Please post DDL, so that people do not have to guess what the keys,
> constraints, Declarative Referential Integrity, data types, etc. in
> your schema are. Sample data is also a good idea, along with clear
> specifications.  It is very hard to debug code when you do not let us
> see it.
>
> You did not so much as tell us the **name* of this table!  There are
> not indexes?
>
> Does your boss -- who is paying you, unlike us who are doing your work
> for you for free -- give sooooooo little to work with?
>



Report this thread to moderator Post Follow-up to this message
Old Post
Rich
07-26-05 12:23 PM


Re: Very slow distinct select
> REMOTE_ADDR has duplicate values. So I added a numeric identity column

A non-unique index will optimize your COUNT(DISTINCT REMOTE_ADDR).  No need
to add a unique-ifier identity column.

How many distinct values?  If it is fairly high, a scan of many index leaf
nodes is needed and this might not perform much better than a table scan.
In that case, you can create a view with the aggregation and then create an
index on that view.  For Example

CREATE VIEW Requests_RemoteAddr
WITH SCHEMABINDING AS
SELECT REMOTE_ADDR, COUNT_BIG(*) AS CoungBig
FROM dbo.Requests
GROUP BY REMOTE_ADDR
GO

CREATE UNIQUE CLUSTERED INDEX  Requests_RemoteAddr_
Index ON
WebLog(REMOTE_ADDR)
GO

Your query can automatically use the view index when accessing the base
table if you are using SQL 2000 Enterprise Edition.  If you are using
another edition, you'll need to access the view and add the NOEXPAND hint:

SELECT COUNT(DISTINCT REMOTE_ADDR)
FROM dbo.Results_View WITH (NOEXPAND)

--
Hope this helps.

Dan Guzman
SQL Server MVP

"Rich" <no@spam.invalid> wrote in message
news:IBgFe.48800$4o.29531@fed1read06...
> REMOTE_ADDR has duplicate values. So I added a numeric identity column and
> did this:
>
> CREATE UNIQUE NONCLUSTERED INDEX Requests_RemoteAddr ON Requests
> (REMOTE_ADDR, ID)
>
> But that didn't seem to help the speed any.
>
> Any other ideas?
>
> Thanks.
>
>
> "Dan Guzman" <guzmanda@nospam-online.sbcglobal.net> wrote in message
> news:09gFe.2034$PC7.1092@newssvr29.news.prodigy.net... 
>
>



Report this thread to moderator Post Follow-up to this message
Old Post
Dan Guzman
07-26-05 12:23 PM


Re: Very slow distinct select
Rich (no@spam.invalid)  writes:
> What the heck is DDL???
> Didn't I give enough info in my two posts?
> And what does the name matter?

DDL = Data Definition Laguage. That is CREATE TABLE etc. Which it appears
that you did include.

Too many SQL savvies just throw this funny TLAs(*) around them, and
don't give a damn if people they pretend to help understand them or not.


(*) TLA = Three-letter abbriviation.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

Report this thread to moderator Post Follow-up to this message
Old Post
Erland Sommarskog
07-26-05 12:23 PM


Re: Very slow distinct select
Rich (no@spam.invalid)  writes:
> My table looks like this:
> char(150) HTTP_REF,
> char(250) HTTP_USER,
> char(150) REMOTE_ADDR,
> char(150) REMOTE_HOST,
> char(150) URL,
> smalldatetime TIME_STAMP
>
> There are no indexes on this table and there are only 293,658 records
> total.
>
> When I do a select like this it takes forever:
>
> SELECT COUNT(DISTINCT REMOTE_ADDR)
>
> Takes 2 minutes. Is there anyway to speed that up?

Assuming that these columns are not really of fixed length, there is
quite something to win by changing char to varchar. char is fixed
length. Your row size is 854 bytes, which means that you can get at
most 9 rows per page. Thus your table requires 32628 pages @ 8192
bytes for a total of 267 MB. Which is a small table.

Say that the average length of all columns is 40 bytes. Then your average
row size is 5*(40+2) + 4 =214, giving 37 rows per page. This reduces the
size of the table to 65 MB.

The smaller the pages, the fewer pages to read, and thus the shorter the
response time.

Adding a non-clustered index as Dan suggested is of course even better
because then the table you traverse is only the index. And if you
use varchar instead that index is even smaller.

Note that I and Dan assume that your query really is

SELECT COUNT(DISTINCT REMOTE_ADDR) FROM tbl

and nothing else. If there is a WHERE clause involved that refers to
other columns, then the index is not that useful.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

Report this thread to moderator Post Follow-up to this message
Old Post
Erland Sommarskog
07-26-05 12:23 PM


Re: Very slow distinct select
Thanks Dan and Erland... it has gone from 2 minutes all the way down to 1
second!
But in the process, I accidentally truncated a lot of URLs to 50. OH WELL.
At least its faster :) Thanks.


"Rich" <no@spam.invalid> wrote in message
news:DbfFe.48792$4o.35562@fed1read06...
> My table looks like this:
> char(150) HTTP_REF,
> char(250) HTTP_USER,
> char(150) REMOTE_ADDR,
> char(150) REMOTE_HOST,
> char(150) URL,
> smalldatetime TIME_STAMP
>
> There are no indexes on this table and there are only 293,658 records
total.
>
> When I do a select like this it takes forever:
>
> SELECT COUNT(DISTINCT REMOTE_ADDR)
>
> Takes 2 minutes. Is there anyway to speed that up?
>
> Thanks
>
>



Report this thread to moderator Post Follow-up to this message
Old Post
Rich
07-27-05 01:24 AM


Sponsored Links





Last Thread Next Thread
Post New Thread

Microsoft SQL Server forum archive

Show a Printable Version Email This Page to Someone! Receive updates to this thread
Microsoft SQL Server
Access database support
PostgreSQL Replication
SQL Server ODBC
FoxPro Support
PostgreSQL pgAdmin
SQL Server Clustering
MySQL ODBC
Web Applications with dBASE
SQL Server CE
MySQL++
Sybase Database Support
MS SQL Full Text Search
PostgreSQL Administration
SQL Anywhere support
DB2 UDB Database
Paradox Database Support
Filemaker Database
Berkley DB
SQL 2000/2000i database
ASE Database
Forum Jump:
All times are GMT. The time now is 12:12 PM.

 
Mobile devices forum | Database support forum archive




Copyrights DropTable.com Database Support Forum 2004 - 2006