|
Home > Archive > Sybase Database > March 2006 > optimizer behaviour changed
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
optimizer behaviour changed
|
|
| Jürgen Bajdala 2006-03-08, 7:23 am |
| Hi,
we have the following table:
create table sort_tab (
katkey INTEGER NULL ,
kateginhalt VARCHAR (245) NULL ,
setnr SMALLINT NULL ,
normkatkey INTEGER NULL
)
(with a hidden SYB_IDENTITY_COL),
with the indexes:
create UNIQUE INDEX i_sort_katkey on sort_tab(katkey)
create INDEX i_sort_inhalt on sort_tab(kateginhalt
)
create INDEX i_sort_norm on sort_tab(setnr,normk
atkey)
create INDEX sort_tab_syb_id_col on sort_tab( SYB_IDENTITY_COL )
And we have the select statement:
select katkey from sort_tab order by kateginhalt
On a 11.9 server the index i_sort_inhalt is used (as we expected):
QUERY PLAN FOR STATEMENT 1 (at line 1).
STEP 1
The type of query is SELECT.
FROM TABLE
sort_tab
Nested iteration.
Index : i_sort_inhalt
Forward scan.
Positioning at index start.
Using I/O Size 2 Kbytes for index leaf pages.
With LRU Buffer Replacement Strategy for index leaf pages.
Using I/O Size 2 Kbytes for data pages.
With LRU Buffer Replacement Strategy for data pages.
On a 12.5.2 server an temp. table is build:
STEP 1
The type of query is INSERT.
The update mode is direct.
Worktable1 created, in allpages locking mode, for ORDER BY.
FROM TABLE
sort_tab
Nested iteration.
Table Scan.
Forward scan.
Positioning at start of table.
Using I/O Size 2 Kbytes for data pages.
With LRU Buffer Replacement Strategy for data pages.
TO TABLE
Worktable1.
STEP 2
The type of query is SELECT.
This step involves sorting.
FROM TABLE
Worktable1.
Using GETSORTED
Table Scan.
Forward scan.
Positioning at start of table.
Using I/O Size 2 Kbytes for data pages.
With MRU Buffer Replacement Strategy for data pages.
Does anybody know why the optimizer works in this way now?
Can I switch back to the original behaviour?
(Of course I can use an index hint for this special select,
but maybe I have trouble with other selects that I'm unaware
at the moment.)
Thanks
Jürgen
| |
| Dualta O'Briain 2006-03-08, 9:23 am |
| I'll take a guess at the answer.
Your select statement selects only a single field. That field has an
index on it. That will make simply reading that index a very attractive
proposal as the server will not have to look at any data pages at all.
Given you have also done an order by which is not in the select list -
the server must build an index to reorder the data which he is reading
directly from the index.
How big is the table ?
If an update statistics will not solve the problem - then I would try
out the query again adding the order by field to the select list
select kateginhalt, katkey from sort_tab order by kateginhalt
- just to see whether the optimizer then plumps for your original index
because you have made the other index less attractive.
If so - then you are probably caught in the space between the optimizer
rules about reading directly from indexes when only keys are being
selected and the statistics you have for your table.
Personally - I would add the hint as there is probably not much you can
do about it if the above holds true.
I make that about 2 cents worth.
Dualta.
Jürgen Bajdala wrote:
> Hi,
>
> we have the following table:
>
> create table sort_tab (
> katkey INTEGER NULL ,
> kateginhalt VARCHAR (245) NULL ,
> setnr SMALLINT NULL ,
> normkatkey INTEGER NULL
> )
>
> (with a hidden SYB_IDENTITY_COL),
> with the indexes:
>
> create UNIQUE INDEX i_sort_katkey on sort_tab(katkey)
> create INDEX i_sort_inhalt on sort_tab(kateginhalt
)
> create INDEX i_sort_norm on sort_tab(setnr,normk
atkey)
> create INDEX sort_tab_syb_id_col on sort_tab( SYB_IDENTITY_COL )
>
> And we have the select statement:
>
> select katkey from sort_tab order by kateginhalt
>
> On a 11.9 server the index i_sort_inhalt is used (as we expected):
>
> QUERY PLAN FOR STATEMENT 1 (at line 1).
> STEP 1
> The type of query is SELECT.
> FROM TABLE
> sort_tab
> Nested iteration.
> Index : i_sort_inhalt
> Forward scan.
> Positioning at index start.
> Using I/O Size 2 Kbytes for index leaf pages.
> With LRU Buffer Replacement Strategy for index leaf pages.
> Using I/O Size 2 Kbytes for data pages.
> With LRU Buffer Replacement Strategy for data pages.
>
> On a 12.5.2 server an temp. table is build:
>
> STEP 1
> The type of query is INSERT.
> The update mode is direct.
> Worktable1 created, in allpages locking mode, for ORDER BY.
>
> FROM TABLE
> sort_tab
> Nested iteration.
> Table Scan.
> Forward scan.
> Positioning at start of table.
> Using I/O Size 2 Kbytes for data pages.
> With LRU Buffer Replacement Strategy for data pages.
> TO TABLE
> Worktable1.
> STEP 2
> The type of query is SELECT.
> This step involves sorting.
>
> FROM TABLE
> Worktable1.
> Using GETSORTED
> Table Scan.
> Forward scan.
> Positioning at start of table.
> Using I/O Size 2 Kbytes for data pages.
> With MRU Buffer Replacement Strategy for data pages.
>
> Does anybody know why the optimizer works in this way now?
> Can I switch back to the original behaviour?
> (Of course I can use an index hint for this special select,
> but maybe I have trouble with other selects that I'm unaware
> at the moment.)
>
| |
| --CELKO-- 2006-03-09, 7:23 am |
| >> SELECT katkey FROM Sort_Tab ORDER BY kateginhalt; <<
This should not work at all. In Standard SQL, the columns in the ORDER
BY clause of the cursor must appear in the SELECT list. The optimizer
is probably expecting good SQL programming and gets confused by this.
Thus a temp table for the sort instead of seeing the index.
Also, including a description of the storage method ("_tab", whicnis
really silly in a language with only one data structure) or data type
or use in a particjlar location ("-key") is a violation of ISO-11179
rules. Name a data elemetn for what it is in the LOGICAL model.
| |
| Jürgen Bajdala 2006-03-09, 9:23 am |
| Dualta O'Briain wrote:
> Your select statement selects only a single field. That field has an
> index on it. That will make simply reading that index a very attractive
> proposal as the server will not have to look at any data pages at all.
>
> Given you have also done an order by which is not in the select list -
> the server must build an index to reorder the data which he is reading
> directly from the index.
>
> How big is the table ?
up to 9 Mio records
>
> If an update statistics will not solve the problem - then I would try
> out the query again adding the order by field to the select list
no change after "update statistics"
> select kateginhalt, katkey from sort_tab order by kateginhalt
same behaviour: temp. table
> Personally - I would add the hint as there is probably not much you can
> do about it if the above holds true.
We do this and it works.
But maybe we have other optimizer problems that we don't know
(and since 12.5 we have more trouble with Sybase then with 11.9).
Jürgen
| |
| Jürgen Bajdala 2006-03-09, 11:23 am |
| dbcc helps (a bit):
dbcc traceon(302)
select katkey from sort_tab order by kateginhalt
[...]
varno = 0, objectid 761053747.
The table (Datarows) has 9061436 rows, 1302390 pages,
The table's Data Page Cluster Ratio 0.999429
-----
The best sort-avert index is index 3, costing 8117949 pages
and generating 9061436 rows per scan.
Search argument selectivity is 1.000000.
The first statement about the table (9061436 rows,
1302390 pages) is right.
The index nr. 3 is the right one, the row number
is ok., but the page numer (8 Mio) is false.
The index has only 1131672 (1 Mio) pages (measured
according to FAQ, chapter 9.1.10).
It's clear that the optimizer won't use an index
scan which cost 8 times more than the table scan.
It's more a problem for Sybase Support.
Thanks
Jürgen
| |
| Mark A. Parsons 2006-03-09, 1:23 pm |
| re: "costing 8117949 pages"
This doesn't mean the table has 8M pages.
This is the estimated *cost*, in terms of the number of cache hits (page
accesses) the dataserver thinks will be required to use the nr. 3 index.
-------------------
The kateginhalt column has it's own index.
*IF* the optimizer were to access the table via the katenginhalt index, the
records would be accessed in kateginhalt order; this would eliminate the
need to sort the result set before sending back to the client. This
operation shows up in the 302 output as a 'sort-avert' operation.
The draw back to using sort-avert is that for each index key, the optimizer
has to make a separate jump over to the associated data page.
In a worse case scenario the sort-avert operation will cost X number of
pages, where X is the sum of the # of pages in the index plus the number of
rows in the table.
For this table, with 9M records, the max cost of the sort-avert operation
would be 9M (one data page access for each of the 9M records) plus the
number of pages used to store the katenginhalt index.
-------------------
The 8M you're seeing is the optimizer's estimate of the cost (in terms of
the number of page/cache hits) *IF* it were to perform a sort-avert
operation. The 8M is *NOT* the number of pages in the index.
As you've surmised, a table scan requiring only 1M pages will be more
efficient than performing a sort-avert operation that would cost ~8M
page/cache hits.
The dbcc(302) output looks 'normal'.
Jürgen Bajdala wrote:
> dbcc helps (a bit):
>
> dbcc traceon(302)
>
> select katkey from sort_tab order by kateginhalt
>
> [...]
> varno = 0, objectid 761053747.
> The table (Datarows) has 9061436 rows, 1302390 pages,
> The table's Data Page Cluster Ratio 0.999429
> -----
> The best sort-avert index is index 3, costing 8117949 pages
> and generating 9061436 rows per scan.
> Search argument selectivity is 1.000000.
>
>
> The first statement about the table (9061436 rows,
> 1302390 pages) is right.
>
> The index nr. 3 is the right one, the row number
> is ok., but the page numer (8 Mio) is false.
>
> The index has only 1131672 (1 Mio) pages (measured
> according to FAQ, chapter 9.1.10).
>
> It's clear that the optimizer won't use an index
> scan which cost 8 times more than the table scan.
>
> It's more a problem for Sybase Support.
>
> Thanks
>
> Jürgen
>
| |
| Jürgen Bajdala 2006-03-10, 7:23 am |
| Mark A. Parsons wrote:
> The draw back to using sort-avert is that for each index key, the
> optimizer has to make a separate jump over to the associated data page.
>
> In a worse case scenario the sort-avert operation will cost X number of
> pages, where X is the sum of the # of pages in the index plus the number
> of rows in the table.
Thanks for the explaination
and sorry for blaiming Sybase
Jürgen
| |
| Mark A. Parsons 2006-03-10, 11:23 am |
| The output from "dbcc traceon(3604,302,310
)" can be quite voluminous, and
very technical. It can be a little daunting the first (second, third,
fourth ... umpteenth) time you use it.
Don't despair ... there are still plenty of other areas where you *can*
blame Sybase (as with most software firms) ... ;-)
Jürgen Bajdala wrote:
> Mark A. Parsons wrote:
>
>
>
>
> Thanks for the explaination
> and sorry for blaiming Sybase
>
> Jürgen
|
|
|
|
|