Home > Archive > MS SQL Server > March 2006 > ways to improve performance of SQL queries









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author ways to improve performance of SQL queries
sahon

2006-03-23, 1:23 pm

Hello

I have table:

CREATE TABLE [dbo].[Data_TD] (
[iID_Dataset] [int] NOT NULL ,
[dDate] [smalldatetime] NOT NULL ,
[fVal] [real] NOT NULL
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[Data_TD] WITH NOCHECK ADD
CONSTRAINT [PK_Data_TD] PRIMARY KEY CLUSTERED
(
[iID_Dataset],
[dDate]
) ON [PRIMARY]
GO

this table can contain millions of records, I need to query this table with
conditions:

iID_Dataset=@iID_Dat
aset, dDate between @startDate and @endDate. Such
conditions can be a lot, for example 100-1000. I used different strategies
to retrieve data:

1. single query
select * from Data_TD where
(iID_Dataset=@iID_Da
taset1 and dDate between @startDate1 and @endDate1) or
(iID_Dataset=@iID_Da
taset2 and dDate between @startDate2 and @endDate2) or
.........................
(iID_Dataset=@iID_Da
taset_n and dDate between @startDate_n and @endDate_n)
2. many results
select * from Data_TD where iID_Dataset=@iID_Dat
aset1 and dDate between
@startDate1 and @endDate1
select * from Data_TD where iID_Dataset=@iID_Dat
aset2 and dDate between
@startDate2 and @endDate2
..........................
select * from Data_TD where iID_Dataset=@iID_Dat
aset_n and dDate between
@startDate_n and @endDate_n

parameters are passed from c# application

I do not really like performance, sometimes it can be 1 hour to retrieve 400
datasets.

Could you propose something to improve performance?

Regards, Sahon


Prashant Barnwal

2006-03-23, 1:23 pm

do not use *, instead use [column list]
use locking hints (for select use nolock)

thx,
PB

"sahon" <sahon@community.nospam> wrote in message
news:eLKJmYoTGHA.4384@tk2msftngp13.phx.gbl...
> Hello
>
> I have table:
>
> CREATE TABLE [dbo].[Data_TD] (
> [iID_Dataset] [int] NOT NULL ,
> [dDate] [smalldatetime] NOT NULL ,
> [fVal] [real] NOT NULL
> ) ON [PRIMARY]
> GO
>
> ALTER TABLE [dbo].[Data_TD] WITH NOCHECK ADD
> CONSTRAINT [PK_Data_TD] PRIMARY KEY CLUSTERED
> (
> [iID_Dataset],
> [dDate]
> ) ON [PRIMARY]
> GO
>
> this table can contain millions of records, I need to query this table
> with conditions:
>
> iID_Dataset=@iID_Dat
aset, dDate between @startDate and @endDate. Such
> conditions can be a lot, for example 100-1000. I used different strategies
> to retrieve data:
>
> 1. single query
> select * from Data_TD where
> (iID_Dataset=@iID_Da
taset1 and dDate between @startDate1 and @endDate1)
> or
> (iID_Dataset=@iID_Da
taset2 and dDate between @startDate2 and @endDate2)
> or
> .........................
> (iID_Dataset=@iID_Da
taset_n and dDate between @startDate_n and
> @endDate_n)
> 2. many results
> select * from Data_TD where iID_Dataset=@iID_Dat
aset1 and dDate between
> @startDate1 and @endDate1
> select * from Data_TD where iID_Dataset=@iID_Dat
aset2 and dDate between
> @startDate2 and @endDate2
> .........................
> select * from Data_TD where iID_Dataset=@iID_Dat
aset_n and dDate between
> @startDate_n and @endDate_n
>
> parameters are passed from c# application
>
> I do not really like performance, sometimes it can be 1 hour to retrieve
> 400 datasets.
>
> Could you propose something to improve performance?
>
> Regards, Sahon
>



_Stephen

2006-03-23, 1:23 pm


"sahon" <sahon@community.nospam> wrote in message
news:eLKJmYoTGHA.4384@tk2msftngp13.phx.gbl...
> Hello
>
> I have table:
>
> CREATE TABLE [dbo].[Data_TD] (
> [iID_Dataset] [int] NOT NULL ,
> [dDate] [smalldatetime] NOT NULL ,
> [fVal] [real] NOT NULL
> ) ON [PRIMARY]
> GO
>
> ALTER TABLE [dbo].[Data_TD] WITH NOCHECK ADD
> CONSTRAINT [PK_Data_TD] PRIMARY KEY CLUSTERED
> (
> [iID_Dataset],
> [dDate]
> ) ON [PRIMARY]
> GO
>
> this table can contain millions of records, I need to query this table
> with conditions:
>
> iID_Dataset=@iID_Dat
aset, dDate between @startDate and @endDate. Such
> conditions can be a lot, for example 100-1000. I used different strategies
> to retrieve data:
>
> 1. single query
> select * from Data_TD where
> (iID_Dataset=@iID_Da
taset1 and dDate between @startDate1 and @endDate1)
> or
> (iID_Dataset=@iID_Da
taset2 and dDate between @startDate2 and @endDate2)
> or
> .........................
> (iID_Dataset=@iID_Da
taset_n and dDate between @startDate_n and
> @endDate_n)
> 2. many results
> select * from Data_TD where iID_Dataset=@iID_Dat
aset1 and dDate between
> @startDate1 and @endDate1
> select * from Data_TD where iID_Dataset=@iID_Dat
aset2 and dDate between
> @startDate2 and @endDate2
> .........................
> select * from Data_TD where iID_Dataset=@iID_Dat
aset_n and dDate between
> @startDate_n and @endDate_n
>
> parameters are passed from c# application
>
> I do not really like performance, sometimes it can be 1 hour to retrieve
> 400 datasets.
>
> Could you propose something to improve performance?


Create an index with the iID_Dataset and dDate columns. Then try it again.

Now when you say 400 datasets, you mean that you are presenting 400+ date
ranges in an hour which I guess, or are you only reciving 400 rows back in
an hours time?





SQLpro [MVP]

2006-03-23, 1:23 pm

sahon a écrit :
> Hello
>
> I have table:
>
> CREATE TABLE [dbo].[Data_TD] (
> [iID_Dataset] [int] NOT NULL ,
> [dDate] [smalldatetime] NOT NULL ,
> [fVal] [real] NOT NULL
> ) ON [PRIMARY]
> GO
>
> ALTER TABLE [dbo].[Data_TD] WITH NOCHECK ADD
> CONSTRAINT [PK_Data_TD] PRIMARY KEY CLUSTERED
> (
> [iID_Dataset],
> [dDate]
> ) ON [PRIMARY]
> GO


A CLUSTER index is a good solution ONLY if the values that you are
inserting upon the time are CONTINUOUSLY increasing for the concatened
value of the 2 col values.

In your data model I do not think it is good. So have a non CLUSTER
index in this case.

Also this index cannot be activate for queries with a WHERE clause based
only on dDate values.

Never use * in SELECT clause.

Instead of :
select * from Data_TD where
(iID_Dataset=@iID_Da
taset1 and dDate between @startDate1 and
@endDate1) or
(iID_Dataset=@iID_Da
taset2 and dDate between @startDate2 and
@endDate2) or
.........................
(iID_Dataset=@iID_Da
taset_n and dDate between @startDate_n and
@endDate_n)

use a table variable and join it.

A +



>
> this table can contain millions of records, I need to query this table with
> conditions:
>
> iID_Dataset=@iID_Dat
aset, dDate between @startDate and @endDate. Such
> conditions can be a lot, for example 100-1000. I used different strategies
> to retrieve data:
>
> 1. single query
> select * from Data_TD where
> (iID_Dataset=@iID_Da
taset1 and dDate between @startDate1 and @endDate1) or
> (iID_Dataset=@iID_Da
taset2 and dDate between @startDate2 and @endDate2) or
> .........................
> (iID_Dataset=@iID_Da
taset_n and dDate between @startDate_n and @endDate_n)
> 2. many results
> select * from Data_TD where iID_Dataset=@iID_Dat
aset1 and dDate between
> @startDate1 and @endDate1
> select * from Data_TD where iID_Dataset=@iID_Dat
aset2 and dDate between
> @startDate2 and @endDate2
> .........................
> select * from Data_TD where iID_Dataset=@iID_Dat
aset_n and dDate between
> @startDate_n and @endDate_n
>
> parameters are passed from c# application
>
> I do not really like performance, sometimes it can be 1 hour to retrieve 400
> datasets.
>
> Could you propose something to improve performance?
>
> Regards, Sahon
>
>



--
Frédéric BROUARD, MVP SQL Server, expert bases de données et langage SQL
Le site sur le langage SQL et les SGBDR : http://sqlpro.developpez.com
Audit, conseil, expertise, formation, modélisation, tuning, optimisation
********************
* http://www.datasapiens.com ********************
***
sahon

2006-03-24, 7:26 am


> Now when you say 400 datasets, you mean that you are presenting 400+ date
> ranges in an hour which I guess, or are you only reciving 400 rows back in
> an hours time?


400 datasets it's 400 conditions like (where iID_Dataset=@iID_Dat
aset1 and
dDate between
> @startDate1 and @endDate1)



sahon

2006-03-24, 7:26 am


> A CLUSTER index is a good solution ONLY if the values that you are
> inserting upon the time are CONTINUOUSLY increasing for the concatened
> value of the 2 col values.
>
> In your data model I do not think it is good. So have a non CLUSTER index
> in this case.


ok, what's better to use?
1. cluster index with identity column and indexes:
1. iID - identity, clustered
dDate, iIDDataset
dDate
iID_Dataset
or
2. only
dDate, iIDDataset
dDate
iID_Dataset - no clustered indexes at all

> use a table variable and join it.


what's better temp table to join
or temp variable to join? in second case I cannot use indexes on this temp
variable


Doug

2006-03-25, 1:23 pm

how unique is iID_Dataset? In other words, is it ALWAYS equal to a
specific value, or does it change for every row, or ???

How unique is dDate? All values spread over evenly over 10 years, or 10
million rows spread across 3 days?

For grins, you might try to create an additional non-clustered index on
dDate, iidDataset.

The goal is to find an index, and hints to the engine about the best
way to zero in on the data you are after. Part of the problem is
finding the combination that allows the engine to do its job
efficiently.

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2009 droptable.com