|
Home > Archive > MySQL ODBC Connector > May 2005 > performance on single column index with few distinct values
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
performance on single column index with few distinct values
|
|
| Terence 2005-05-28, 3:23 am |
| Hi list,
I have run into problems on a master table for our helpdesk. We have the
following table:
ticket_id (int) - autoincrement (indexed)
master_id (int) (indexed)
Master ID is used to distinguish multiple helpdesks. In this table there
are 100k records, but only 10 distinct master_id's.
For example:
ticket_id master_id
1 1
2 1
3 2
4 2
5 3
.... ...
When trying to do pagination I use the following SQL:
SELECT ticket_id
FROM my_table
WHERE master_id = '1'
ORDER BY ticket_id DESC
LIMIT 0,10
The problem is that there are 20k records where master_id = 1, so the
lookup is pretty slow especially when I start joining other tables. When
joining other tables the query gets slower and slower, I guess because
the lookups on joining tables result in fewer rows being joined when
using EXPLAIN.
SELECT *
FROM helpdesk_tickets ht, helpdesk_category_ma
ster hcm,
helpdesk_sub_categor
y_master hscm
WHERE ht.master_id = '1'
AND ht.category_id = hcm.category_id
AND ht.sub_category_id = hscm.sub_category_id
ORDER BY ticket_id DESC
LIMIT 0,10
I have thought of options such as using temporary tables to just grab
the last 10 tickets and then do an IN query, however I need to display
totals, so that would require me to run the query again.
My questions are:
1) Is there any point to having an index on a column with so few unique
values?
2) Would it make more sense to have multiple master tables for each
helpdesk? Such as:
helpdesk_tickets_1
helpdesk_tickets_2
helpdesk_tickets_3 etc.
and then using a session value to query the table?
3) Any other tips or advice? (I notice my query time doubles from 100k
rows to 150k rows)
Thanks for any help...
Terence
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw
| |
| Daniel 2005-05-29, 3:23 am |
| Is there a composite index on (master_id, ticket_id)? Since your queries
are selecting on a particular master_id, and ordering by ticket_id, along
with the limit I think MySQL would be able to use such an index in an
optimization.
-Daniel
-----Original Message-----
From: Terence & #91;mailto:terence@s
unway.edu.my]
Sent: Friday, May 27, 2005 10:52 PM
To: mysql@lists.mysql.com
Subject: performance on single column index with few distinct values
Hi list,
I have run into problems on a master table for our helpdesk. We have the
following table:
ticket_id (int) - autoincrement (indexed)
master_id (int) (indexed)
Master ID is used to distinguish multiple helpdesks. In this table there
are 100k records, but only 10 distinct master_id's.
For example:
ticket_id master_id
1 1
2 1
3 2
4 2
5 3
.... ...
When trying to do pagination I use the following SQL:
SELECT ticket_id
FROM my_table
WHERE master_id = '1'
ORDER BY ticket_id DESC
LIMIT 0,10
The problem is that there are 20k records where master_id = 1, so the
lookup is pretty slow especially when I start joining other tables. When
joining other tables the query gets slower and slower, I guess because
the lookups on joining tables result in fewer rows being joined when
using EXPLAIN.
SELECT *
FROM helpdesk_tickets ht, helpdesk_category_ma
ster hcm,
helpdesk_sub_categor
y_master hscm
WHERE ht.master_id = '1'
AND ht.category_id = hcm.category_id
AND ht.sub_category_id = hscm.sub_category_id
ORDER BY ticket_id DESC
LIMIT 0,10
I have thought of options such as using temporary tables to just grab
the last 10 tickets and then do an IN query, however I need to display
totals, so that would require me to run the query again.
My questions are:
1) Is there any point to having an index on a column with so few unique
values?
2) Would it make more sense to have multiple master tables for each
helpdesk? Such as:
helpdesk_tickets_1
helpdesk_tickets_2
helpdesk_tickets_3 etc.
and then using a session value to query the table?
3) Any other tips or advice? (I notice my query time doubles from 100k
rows to 150k rows)
Thanks for any help...
Terence
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...node
.com
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw
| |
| Jochem van Dieten 2005-05-29, 3:23 am |
| On 5/28/05, Terence wrote:
>=20
> Master ID is used to distinguish multiple helpdesks. In this table there
> are 100k records, but only 10 distinct master_id's.
>=20
> ticket_id master_id
> 1 1
> 2 1
> 3 2
> 4 2
> 5 3
> ... ...
> SELECT *
> FROM helpdesk_tickets ht, helpdesk_category_ma
ster hcm,
> helpdesk_sub_categor
y_master hscm
> WHERE ht.master_id =3D '1'
> AND ht.category_id =3D hcm.category_id
> AND ht.sub_category_id =3D hscm.sub_category_id
> ORDER BY ticket_id DESC
> LIMIT 0,10
With proper foreign keys all tickets are guaranteed to match
categories and subcategories. That means you can push down the LIMIT
to just the helpdesk_tickets table:
SELECT *
FROM (
SELECT *
FROM helpdesk_tickets
WHERE master_id =3D 1
ORDER BY ticket_id DESC
LIMIT 0,10)
NATURAL JOIN helpdesk_category_ma
ster
NATURAL JOIN helpdesk_sub_categor
y_master
With an index on (master_id, ticket_id) this should (nearly) constant-time.
> I have thought of options such as using temporary tables to just grab
> the last 10 tickets and then do an IN query, however I need to display
> totals, so that would require me to run the query again.
At which point does running the query again become faster then your
current method?
Jochem
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw
| |
| Terence 2005-05-30, 3:23 am |
| Hi Daniel,
Here's some more data which might help. This query does not appear to be
using the composite index although there is one on the master_id and
ticket_id. The below query takes about 3 seconds. Anyway I think I will
still move to seperate tables for each master id. Thanks for the reply.
EXPLAIN SELECT *
FROM helpdesk_tickets ht
WHERE ht.master_id = '1'
ORDER BY ticket_id DESC
LIMIT 0,10
id,select_type,table
,type,possible_keys,
key,key_len,ref,rows
,Extra,
1,SIMPLE,ht,range,ma
ster_id,master_id,4,
,168439,Using where; Using
filesort,
CREATE TABLE `helpdesk_tickets` (
`ticket_id` int(5) NOT NULL auto_increment,
`running_id` int(5) NOT NULL default '0',
`master_id` int(5) NOT NULL default '0',
`category_id` int(5) NOT NULL default '0',
`sub_category_id` int(5) NOT NULL default '0',
`priority_id` int(5) NOT NULL default '0',
`location_id` int(5) NOT NULL default '0',
`status_id` int(5) NOT NULL default '0',
`user_logging_id` int(5) NOT NULL default '0',
`user_problem_id` int(5) default '0',
`other_user_problem`
varchar(255) NOT NULL default '',
`title` varchar(255) NOT NULL default '',
`description` text NOT NULL,
`submit_date` datetime default NULL,
`submit_month` tinyint(2) NOT NULL default '0',
`complete_date` datetime default NULL,
`feedback_id` int(5) NOT NULL default '0',
`review_id` int(5) NOT NULL default '0',
`feedback_comments` text NOT NULL,
`review_comments` text NOT NULL,
`significance_id` int(5) NOT NULL default '0',
`time_spent` int(5) NOT NULL default '0',
`requires_ams_update
` enum('y','n') NOT NULL default 'n',
`duplicate_ticket` enum('y','n') NOT NULL default 'n',
`friendly_id` int(5) NOT NULL default '0',
`due_date` datetime NOT NULL default '0000-00-00 00:00:00',
PRIMARY KEY (`ticket_id`),
UNIQUE KEY `problem_id` (`ticket_id`),
KEY `priority_id` (`priority_id`),
KEY `category_id` (`category_id`),
KEY `sub_category_id` (`sub_category_id`),
KEY `location_id` (`location_id`),
KEY `user_logging_id` (`user_logging_id`),
KEY `user_problem_id` (`user_problem_id`),
KEY `status_id` (`status_id`),
KEY `submit_date` (`submit_date`),
KEY `running_id` (`running_id`),
KEY `submit_month` (`submit_month`),
KEY `master_id` (`master_id`),
KEY `comp_ticket_master`
(`ticket_id`,`master
_id`),
FOREIGN KEY (`user_logging_id`) REFERENCES `intranet_user_login
`
(`user_login_id`),
FOREIGN KEY (`sub_category_id`) REFERENCES
`helpdesk_sub_catego
ry_master` (`sub_category_id`),
FOREIGN KEY (`category_id`) REFERENCES `helpdesk_category_m
aster`
(`category_id`)
) TYPE=InnoDB CHARSET=latin1 ROW_FORMAT=DYNAMIC COMMENT='InnoDB free:
97280 kB'
Daniel wrote:
>Is there a composite index on (master_id, ticket_id)? Since your queries
>are selecting on a particular master_id, and ordering by ticket_id, along
>with the limit I think MySQL would be able to use such an index in an
>optimization.
>
>-Daniel
>
>-----Original Message-----
>From: Terence & #91;mailto:terence@s
unway.edu.my]
>Sent: Friday, May 27, 2005 10:52 PM
>To: mysql@lists.mysql.com
>Subject: performance on single column index with few distinct values
>
>
>Hi list,
>
>I have run into problems on a master table for our helpdesk. We have the
>following table:
>
>ticket_id (int) - autoincrement (indexed)
>master_id (int) (indexed)
>
>Master ID is used to distinguish multiple helpdesks. In this table there
>are 100k records, but only 10 distinct master_id's.
>For example:
>
>ticket_id master_id
>1 1
>2 1
>3 2
>4 2
>5 3
>... ...
>
>When trying to do pagination I use the following SQL:
>
>SELECT ticket_id
>FROM my_table
>WHERE master_id = '1'
>ORDER BY ticket_id DESC
>LIMIT 0,10
>
>The problem is that there are 20k records where master_id = 1, so the
>lookup is pretty slow especially when I start joining other tables. When
>joining other tables the query gets slower and slower, I guess because
>the lookups on joining tables result in fewer rows being joined when
>using EXPLAIN.
>
>SELECT *
>FROM helpdesk_tickets ht, helpdesk_category_ma
ster hcm,
> helpdesk_sub_categor
y_master hscm
>WHERE ht.master_id = '1'
>AND ht.category_id = hcm.category_id
>AND ht.sub_category_id = hscm.sub_category_id
>ORDER BY ticket_id DESC
>LIMIT 0,10
>
>I have thought of options such as using temporary tables to just grab
>the last 10 tickets and then do an IN query, however I need to display
>totals, so that would require me to run the query again.
>
>My questions are:
>
>1) Is there any point to having an index on a column with so few unique
>values?
>2) Would it make more sense to have multiple master tables for each
>helpdesk? Such as:
>helpdesk_tickets_1
>helpdesk_tickets_2
>helpdesk_tickets_3 etc.
>and then using a session value to query the table?
>3) Any other tips or advice? (I notice my query time doubles from 100k
>rows to 150k rows)
>
>Thanks for any help...
>
>Terence
>
>
>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw
| |
| mfatene@free.fr 2005-05-30, 3:23 am |
| Hi,
remember that using index can be slower that FTS. if there is a low index column
selectivity, it's normal that your query doesn't use it.
try using some hints to force index usage (like in oracle) or put your table as
an index organiztion table.
look also at myisam merge tables.
mathias
Selon Terence <terence@sunway.edu.my>:
> Hi Daniel,
>
> Here's some more data which might help. This query does not appear to be
> using the composite index although there is one on the master_id and
> ticket_id. The below query takes about 3 seconds. Anyway I think I will
> still move to seperate tables for each master id. Thanks for the reply.
>
> EXPLAIN SELECT *
> FROM helpdesk_tickets ht
> WHERE ht.master_id = '1'
> ORDER BY ticket_id DESC
> LIMIT 0,10
>
> id,select_type,table
,type,possible_keys,
key,key_len,ref,rows
,Extra,
> 1,SIMPLE,ht,range,ma
ster_id,master_id,4,
,168439,Using where; Using
> filesort,
>
> CREATE TABLE `helpdesk_tickets` (
> `ticket_id` int(5) NOT NULL auto_increment,
> `running_id` int(5) NOT NULL default '0',
> `master_id` int(5) NOT NULL default '0',
> `category_id` int(5) NOT NULL default '0',
> `sub_category_id` int(5) NOT NULL default '0',
> `priority_id` int(5) NOT NULL default '0',
> `location_id` int(5) NOT NULL default '0',
> `status_id` int(5) NOT NULL default '0',
> `user_logging_id` int(5) NOT NULL default '0',
> `user_problem_id` int(5) default '0',
> `other_user_problem`
varchar(255) NOT NULL default '',
> `title` varchar(255) NOT NULL default '',
> `description` text NOT NULL,
> `submit_date` datetime default NULL,
> `submit_month` tinyint(2) NOT NULL default '0',
> `complete_date` datetime default NULL,
> `feedback_id` int(5) NOT NULL default '0',
> `review_id` int(5) NOT NULL default '0',
> `feedback_comments` text NOT NULL,
> `review_comments` text NOT NULL,
> `significance_id` int(5) NOT NULL default '0',
> `time_spent` int(5) NOT NULL default '0',
> `requires_ams_update
` enum('y','n') NOT NULL default 'n',
> `duplicate_ticket` enum('y','n') NOT NULL default 'n',
> `friendly_id` int(5) NOT NULL default '0',
> `due_date` datetime NOT NULL default '0000-00-00 00:00:00',
> PRIMARY KEY (`ticket_id`),
> UNIQUE KEY `problem_id` (`ticket_id`),
> KEY `priority_id` (`priority_id`),
> KEY `category_id` (`category_id`),
> KEY `sub_category_id` (`sub_category_id`),
> KEY `location_id` (`location_id`),
> KEY `user_logging_id` (`user_logging_id`),
> KEY `user_problem_id` (`user_problem_id`),
> KEY `status_id` (`status_id`),
> KEY `submit_date` (`submit_date`),
> KEY `running_id` (`running_id`),
> KEY `submit_month` (`submit_month`),
> KEY `master_id` (`master_id`),
> KEY `comp_ticket_master`
(`ticket_id`,`master
_id`),
> FOREIGN KEY (`user_logging_id`) REFERENCES `intranet_user_login
`
> (`user_login_id`),
> FOREIGN KEY (`sub_category_id`) REFERENCES
> `helpdesk_sub_catego
ry_master` (`sub_category_id`),
> FOREIGN KEY (`category_id`) REFERENCES `helpdesk_category_m
aster`
> (`category_id`)
> ) TYPE=InnoDB CHARSET=latin1 ROW_FORMAT=DYNAMIC COMMENT='InnoDB free:
> 97280 kB'
>
> Daniel wrote:
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql?unsub=mfatene@free.fr
>
>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw
| |
| Daniel 2005-05-30, 9:23 am |
| Actually, given the below CREATE statement, there is not an index on
(master_id, ticket_id). There's KEY `comp_ticket_master`
(`ticket_id`,`master
_id`),
which is not the same--the order in which the columns are specified matters.
If you try a (master_id, ticket_id) index and MySQL still won't use
it, trying forcing it to at least see what affect it has, like
"... FROM helpdesk_tickets ht FORCE INDEX (comp_master_ticket)
, ..."
-Daniel
-----Original Message-----
From: Terence & #91;mailto:terence@s
unway.edu.my]
Sent: Sunday, May 29, 2005 9:12 PM
To: mysql@lists.mysql.com
Subject: Re: performance on single column index with few distinct values
Hi Daniel,
Here's some more data which might help. This query does not appear to be
using the composite index although there is one on the master_id and
ticket_id. The below query takes about 3 seconds. Anyway I think I will
still move to seperate tables for each master id. Thanks for the reply.
EXPLAIN SELECT *
FROM helpdesk_tickets ht
WHERE ht.master_id = '1'
ORDER BY ticket_id DESC
LIMIT 0,10
id,select_type,table
,type,possible_keys,
key,key_len,ref,rows
,Extra,
1,SIMPLE,ht,range,ma
ster_id,master_id,4,
,168439,Using where; Using
filesort,
CREATE TABLE `helpdesk_tickets` (
`ticket_id` int(5) NOT NULL auto_increment,
`running_id` int(5) NOT NULL default '0',
`master_id` int(5) NOT NULL default '0',
`category_id` int(5) NOT NULL default '0',
`sub_category_id` int(5) NOT NULL default '0',
`priority_id` int(5) NOT NULL default '0',
`location_id` int(5) NOT NULL default '0',
`status_id` int(5) NOT NULL default '0',
`user_logging_id` int(5) NOT NULL default '0',
`user_problem_id` int(5) default '0',
`other_user_problem`
varchar(255) NOT NULL default '',
`title` varchar(255) NOT NULL default '',
`description` text NOT NULL,
`submit_date` datetime default NULL,
`submit_month` tinyint(2) NOT NULL default '0',
`complete_date` datetime default NULL,
`feedback_id` int(5) NOT NULL default '0',
`review_id` int(5) NOT NULL default '0',
`feedback_comments` text NOT NULL,
`review_comments` text NOT NULL,
`significance_id` int(5) NOT NULL default '0',
`time_spent` int(5) NOT NULL default '0',
`requires_ams_update
` enum('y','n') NOT NULL default 'n',
`duplicate_ticket` enum('y','n') NOT NULL default 'n',
`friendly_id` int(5) NOT NULL default '0',
`due_date` datetime NOT NULL default '0000-00-00 00:00:00',
PRIMARY KEY (`ticket_id`),
UNIQUE KEY `problem_id` (`ticket_id`),
KEY `priority_id` (`priority_id`),
KEY `category_id` (`category_id`),
KEY `sub_category_id` (`sub_category_id`),
KEY `location_id` (`location_id`),
KEY `user_logging_id` (`user_logging_id`),
KEY `user_problem_id` (`user_problem_id`),
KEY `status_id` (`status_id`),
KEY `submit_date` (`submit_date`),
KEY `running_id` (`running_id`),
KEY `submit_month` (`submit_month`),
KEY `master_id` (`master_id`),
KEY `comp_ticket_master`
(`ticket_id`,`master
_id`),
FOREIGN KEY (`user_logging_id`) REFERENCES `intranet_user_login
`
(`user_login_id`),
FOREIGN KEY (`sub_category_id`) REFERENCES
`helpdesk_sub_catego
ry_master` (`sub_category_id`),
FOREIGN KEY (`category_id`) REFERENCES `helpdesk_category_m
aster`
(`category_id`)
) TYPE=InnoDB CHARSET=latin1 ROW_FORMAT=DYNAMIC COMMENT='InnoDB free:
97280 kB'
Daniel wrote:
>Is there a composite index on (master_id, ticket_id)? Since your queries
>are selecting on a particular master_id, and ordering by ticket_id, along
>with the limit I think MySQL would be able to use such an index in an
>optimization.
>
>-Daniel
>
>-----Original Message-----
>From: Terence & #91;mailto:terence@s
unway.edu.my]
>Sent: Friday, May 27, 2005 10:52 PM
>To: mysql@lists.mysql.com
>Subject: performance on single column index with few distinct values
>
>
>Hi list,
>
>I have run into problems on a master table for our helpdesk. We have the
>following table:
>
>ticket_id (int) - autoincrement (indexed)
>master_id (int) (indexed)
>
>Master ID is used to distinguish multiple helpdesks. In this table there
>are 100k records, but only 10 distinct master_id's.
>For example:
>
>ticket_id master_id
>1 1
>2 1
>3 2
>4 2
>5 3
>... ...
>
>When trying to do pagination I use the following SQL:
>
>SELECT ticket_id
>FROM my_table
>WHERE master_id = '1'
>ORDER BY ticket_id DESC
>LIMIT 0,10
>
>The problem is that there are 20k records where master_id = 1, so the
>lookup is pretty slow especially when I start joining other tables. When
>joining other tables the query gets slower and slower, I guess because
>the lookups on joining tables result in fewer rows being joined when
>using EXPLAIN.
>
>SELECT *
>FROM helpdesk_tickets ht, helpdesk_category_ma
ster hcm,
> helpdesk_sub_categor
y_master hscm
>WHERE ht.master_id = '1'
>AND ht.category_id = hcm.category_id
>AND ht.sub_category_id = hscm.sub_category_id
>ORDER BY ticket_id DESC
>LIMIT 0,10
>
>I have thought of options such as using temporary tables to just grab
>the last 10 tickets and then do an IN query, however I need to display
>totals, so that would require me to run the query again.
>
>My questions are:
>
>1) Is there any point to having an index on a column with so few unique
>values?
>2) Would it make more sense to have multiple master tables for each
>helpdesk? Such as:
>helpdesk_tickets_1
>helpdesk_tickets_2
>helpdesk_tickets_3 etc.
>and then using a session value to query the table?
>3) Any other tips or advice? (I notice my query time doubles from 100k
>rows to 150k rows)
>
>Thanks for any help...
>
>Terence
>
>
>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...node
.com
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw
|
|
|
|
|