Home > Archive > Microsoft SQL Server forum > July 2005 > My design - where did I go wrong?









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author My design - where did I go wrong?
Rich

2005-07-29, 1:23 pm

CREATE TABLE Sales1
(
varchar (10) CustID,
varchar (10) TransID,
datetime SaleDate,
money S1,
money S2,
money S3,
money S4,
numeric V1
)

CREATE NONCLUSTERED INDEX Sales1_CustID ON Sales1 (CustID)

CREATE NONCLUSTERED INDEX Sales1_SaleDate ON Sales1 (SaleDate)

"money" is just the right size for my fields. This table has 9,500,000
records.

Although I need to do this select in less than a half second, it takes 1
full minute:
SELECT MAX(S1) FROM Sales1 WHERE SaleDate > '1/1/2005'

And I need to do this select in less than a half second, but it takes 3
minutes:
SELECT AVG(S1 / S2) FROM Sales1 WHERE S3 > S4

Am I supposed to create a new field with these values pre-calculated? I hope
not, because I have several other formulas - up to 500 different types of
selects which are all similar.

I am testing with SQL 2000 Developer. XEON CPU, fast SATA hard drive and 2GB
of memory.

Thanks


Gert-Jan Strik

2005-07-29, 8:24 pm

See inline

Rich wrote:
>
> CREATE TABLE Sales1
> (
> varchar (10) CustID,
> varchar (10) TransID,
> datetime SaleDate,
> money S1,
> money S2,
> money S3,
> money S4,
> numeric V1
> )


This is not proper DDL. In a proper CREATE TABLE statement the column
name comes first, then its data type.

The columns S1, S2, S3, S4 and V1 have very poor names. You haven't
specified what these columns represent. I sure hope your table is
properly normalized.

> CREATE NONCLUSTERED INDEX Sales1_CustID ON Sales1 (CustID)
>
> CREATE NONCLUSTERED INDEX Sales1_SaleDate ON Sales1 (SaleDate)


Your table is missing a Primary Key constraint. What is the table's key?
Is it the combination of CustID and SaleDate? Please add a Primary Key.
It will automatically be uniquely indexed.

> "money" is just the right size for my fields. This table has 9,500,000
> records.
>
> Although I need to do this select in less than a half second, it takes 1
> full minute:
> SELECT MAX(S1) FROM Sales1 WHERE SaleDate > '1/1/2005'


If you really use a literal (as in the example above) and the index
statistics are up to date, then the query will be as fast as possible.

However, if in reality you are using a local variable or parameter (for
example ... WHERE SaleDate > @SomeDate) then SQL-Server might not know
that the index is useful. In that case you could check out the
performance if you add an Index Hint.

Please note that the preferred dateformat is '20050101' or
'2005-01-01T00:00:00', because this is a safe notation. The
interpretation of '1/1/2005' will depend on the server language.

> And I need to do this select in less than a half second, but it takes 3
> minutes:
> SELECT AVG(S1 / S2) FROM Sales1 WHERE S3 > S4


There is currently no index on (S3,S4). And even if there was, then it
is still not clear if it could be used (you would have to check that). I
doubt that you will get this query to 0.5 seconds. What does the query
mean anyway? It is a very strange query in the context of a Sales table
with a CustID column. It looks like a reporting query, not a query that
requires a performance of < 0.5 seconds.

Note that it is not safe to divide money data types, because the result
will also be a money data type. First of all, money divided by money
results in a ratio, not another money amount. Second, all precision
beyond the 4th decimal is lost in a money data type, so your ratio will
only have 4 decimals. Casting the S1 and S2 values to decimal should
solve that problem.

> Am I supposed to create a new field with these values pre-calculated? I hope
> not, because I have several other formulas - up to 500 different types of
> selects which are all similar.


You could have a look indexed views. They could hurt
Insert/Update/Delete performance, but it may enable you to run this
reporting type queries pretty fast.

On the other hand, have a good look at what you need to achieve, and if
you really need the performance you are asking for. It might mean you
have to redesign your solution, because processing full 9.5 million rows
will never be lightning fast...

Gert-Jan

> I am testing with SQL 2000 Developer. XEON CPU, fast SATA hard drive and 2GB
> of memory.
>
> Thanks

Erland Sommarskog

2005-07-29, 8:24 pm

Rich (no@spam.invalid) writes:
> CREATE TABLE Sales1
> (
> varchar (10) CustID,
> varchar (10) TransID,
> datetime SaleDate,
> money S1,
> money S2,
> money S3,
> money S4,
> numeric V1
> )
>
> CREATE NONCLUSTERED INDEX Sales1_CustID ON Sales1 (CustID)
>
> CREATE NONCLUSTERED INDEX Sales1_SaleDate ON Sales1 (SaleDate)
>
> "money" is just the right size for my fields. This table has 9,500,000
> records.
>
> Although I need to do this select in less than a half second, it takes 1
> full minute:
> SELECT MAX(S1) FROM Sales1 WHERE SaleDate > '1/1/2005'


Add S1 to the non-clustered index. Or make the index on SaleDate clustered.

> And I need to do this select in less than a half second, but it takes 3
> minutes:
> SELECT AVG(S1 / S2) FROM Sales1 WHERE S3 > S4


That's a tough one. With that requirement for response time, it seems
that you would need to add a computed column with S3-S4 and then index
that column, and add S1 and S2 to that index. And rewrite the query as
"WHERE S3_minus_S4 > 0". Alternatively define an indexed view on this
condition.

Under the current circumstances, SQL Server will have to scan the
entire table.

> Am I supposed to create a new field with these values pre-calculated? I
> hope not, because I have several other formulas - up to 500 different
> types of selects which are all similar.


Ouch. Well, it seems that you need to do a more thorough review or
the requiremens and make a new design.



--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp
Rich

2005-07-29, 8:24 pm


"Erland Sommarskog" <esquel@sommarskog.se> wrote in message
news:Xns96A32A3CEF98
Yazorman@127.0.0.1...
> Rich (no@spam.invalid) writes:
>
> Add S1 to the non-clustered index. Or make the index on SaleDate

clustered.
>
>
> That's a tough one. With that requirement for response time, it seems
> that you would need to add a computed column with S3-S4 and then index
> that column, and add S1 and S2 to that index. And rewrite the query as
> "WHERE S3_minus_S4 > 0". Alternatively define an indexed view on this
> condition.
>
> Under the current circumstances, SQL Server will have to scan the
> entire table.
>
>
> Ouch. Well, it seems that you need to do a more thorough review or
> the requiremens and make a new design.
>
>
>
> --
> Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
>
> Books Online for SQL Server SP3 at
> http://www.microsoft.com/sql/techin.../2000/books.asp



Thanks Erland. I've also just started reading about Real-Time OLAP cubes and
Analysis Services. Would this help me in any way?

Richard


Erland Sommarskog

2005-07-29, 8:24 pm

Rich (no@spam.invalid) writes:
> Thanks Erland. I've also just started reading about Real-Time OLAP cubes
> and Analysis Services. Would this help me in any way?


Maybe. Or to be more frank: I don't have the faintest idea. I have
never looked Analysis Services, so I can't tell.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp
--CELKO--

2005-07-30, 11:23 am

>> I've also just started reading about Real-Time OLAP cubes and Analysis Services. Would this help me in any way? <<

This is probably the best answer, but in the meantime you can do a VIEW
with all the summaries: Basically convert all of your WHERE clauses
into CASE expressions

SELECT MAX (CASE WHEN sale_date > '2005-01-01' THEN s1 END),
AVG (CASE WHEN S3 > S4 THEN (s1 / s2) END),
etc.
FROM Sales1 ;

Erland Sommarskog

2005-07-30, 11:23 am

--CELKO-- (jcelko212@earthlink
.net) writes:
Services. Would this help me in any way? <<[color=darkred]
>
> This is probably the best answer, but in the meantime you can do a VIEW
> with all the summaries: Basically convert all of your WHERE clauses
> into CASE expressions
>
> SELECT MAX (CASE WHEN sale_date > '2005-01-01' THEN s1 END),
> AVG (CASE WHEN S3 > S4 THEN (s1 / s2) END),
> etc.
> FROM Sales1 ;


That would have to be an indexed view to give the performance Rich is
required to produce. Since there are a lot of restrictions with indexed
views on what you can put into them, this may require quite some tweaks.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp
--CELKO--

2005-07-30, 8:23 pm

I am guessing that he has to do a table scan for some of these 500
calculations, so if we can get it down to one scan, this will be the
best we can hope for. I have not played with indexed views, so maybe
someone else can comment here.

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com