January 22, 2008 at 5:27 pm
Hi,
I have a table with 100 M+ rows. As the table grows, the select queries are getting slower. In the execution plan, everything looks fine since I'm having clustered index seek. But a query can take more than 2 minutes which is too long.
Here's a typical query :
select top 500 sum(nb) as nbCount, itemId as item,
DATEADD(wk, DATEDIFF(wk, 0, dateadd(hh, 0, tbl_computedBAseReportLogs.date)), 0) as sDate
from tbl_computedBAseReportLogs where
tbl_computedBAseReportLogs.date between '2007-07-01 00:00:00' and '2008-01-22 23:59:00' and
tbl_computedBAseReportLogs.fk_tbl_websites_Id = 10449 and
tbl_computedBAseReportLogs.fk_tbl_baseReports_id = 108
group by tbl_computedBAseReportLogs.itemId,
DATEADD(wk, DATEDIFF(wk, 0, dateadd(hh, 0, tbl_computedBAseReportLogs.Date)), 0) order by nbCount desc
And here's the table structure:
CREATE TABLE [dbo].[tbl_computedBaseReportLogs](
[date] [smalldatetime] NOT NULL CONSTRAINT [DF_tbl_computedBaseReportLogs_date] DEFAULT (getdate()),
[fk_tbl_websites_id] [int] NOT NULL,
[fk_tbl_baseReports_id] [int] NOT NULL,
[itemId] [int] NOT NULL CONSTRAINT [DF_tbl_computedBaseReportLogs_itemId] DEFAULT ((0)),
[secondaryItemId] [int] NOT NULL CONSTRAINT [DF_tbl_computedBaseReportLogs_secondaryItemId] DEFAULT ((0)),
[nb] [int] NULL CONSTRAINT [DF_tbl_computedBaseReportLogs_nb] DEFAULT ((0)),
CONSTRAINT [PK_tbl_computedBaseReportLogs] PRIMARY KEY CLUSTERED
(
[date] ASC,
[fk_tbl_websites_id] ASC,
[fk_tbl_baseReports_id] ASC,
[itemId] ASC,
[secondaryItemId] ASC
)WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
What I'm am doing wrong? How could I get the query to run faster? How can I optimize the index?
Thanks for any idea
Stephane
January 22, 2008 at 7:30 pm
The PK Clustered Index you have might be good for something else, but it's no good for this query even though the execution plan say's you're using the clustered index... doesn't even have the NB column in it.
If you want this query to fly, you'll need to add the following index... takes a bit to add but tests on 10 million rows produces the correct return in less than a second...
CREATE INDEX NDX_tbl_computedBaseReportLogs1
ON dbo.tbl_computedBaseReportLogs
(fk_tbl_websites_id, Date, fk_tbl_baseReports_id, itemId, nb)
--Jeff Moden
Change is inevitable... Change for the better is not.
January 22, 2008 at 7:31 pm
Heh... I know what you're thinking and "No", do NOT change the order of the columns in the index I recommended 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
January 22, 2008 at 8:43 pm
Awesome! I can't believe it works that fast!
Thank you very very much! 🙂
January 22, 2008 at 8:56 pm
Thanks for the feedback... as always, my test data may not be 100% the same as your situation... once you've tried the index, lemme know how it works... might be a bit of tweeking we need to do depending on the cardinality of each column.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 12, 2008 at 1:24 pm
Hello!
I have been looking on the net to solve an issue and found this entry, which is so much like mine. Hope someone can help. I have a table that contains 1.5 million records and I have a simple select query with a just one filter. Pls find below the query and the table structure.
I think I'm missing something on the indexes. Right now, the exec plan for the query indicates a clustered index seek. But it runs for more than 3+ mins! the records returned are about 90,000.
This table does not have any keys. So I've created indexes on fields used for filter conditions. Especially for this query, I created a clustered index on the cSpecialty & id column. But that doesn't seem to help.
Query:
------
SELECT indicator, cProviderNo, cUPIN, cFirstName, cMiddleName, cLastName,
cnamesuffix, linnetwork, cSpecialty, cMasterNo, cName2, vendEffective AS dEffective,
vendTermination AS dTermination, txtaddress, cCity, cState, cZip, cPhone1,
cName1, cHIPAANPI
FROM ProviderLookup_2
WHERE cSpecialty = 'INTERNAL MEDICINE'
Table structure:
---------------
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[ProviderLookup_2]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[ProviderLookup_2]
GO
CREATE TABLE [dbo].[ProviderLookup_2] (
[Id] [int] IDENTITY (0, 1) NOT NULL ,
[indicator] [varchar] (1) NOT NULL ,
[cProviderNo] [varchar] (50) NULL ,
[cUPIN] [char] (10) NULL ,
[cFirstName] [char] (22) NULL ,
[cMiddleName] [char] (20) NULL ,
[cLastName] [varchar] (60) NULL ,
[cnamesuffix] [char] (20) NULL ,
[linnetwork] [bit] NULL ,
[cSpecialty] [varchar] (50) NULL ,
[cMasterNo] [char] (30) NULL ,
[cName2] [char] (60) NULL ,
[provdEffective] [datetime] NULL ,
[provdTermination] [datetime] NULL ,
[vendEffective] [datetime] NULL ,
[vendTermination] [datetime] NULL ,
[txtaddress] [text] NULL ,
[cCity] [char] (30) NULL ,
[cState] [char] (2) NULL ,
[cZip] [char] (9) NULL ,
[cPhone1] [char] (14) NULL ,
[cName1] [varchar] (75) NULL ,
[cHIPAANPI] [char] (10) NULL ,
[cTaxID] [char] (9) NULL ,
[cSSN] [char] (9) NULL ,
[Fullname] [varchar] (85) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [IND_id_cSpecialty] ON [dbo].[ProviderLookup_2]([cSpecialty], [Id]) WITH FILLFACTOR = 80 ON [PRIMARY]
GO
CREATE INDEX [ind_cProviderNo] ON [dbo].[ProviderLookup_2]([cProviderNo]) WITH FILLFACTOR = 100 ON [PRIMARY]
GO
CREATE INDEX [ind_cUPIN] ON [dbo].[ProviderLookup_2]([cUPIN]) WITH FILLFACTOR = 100 ON [PRIMARY]
GO
CREATE INDEX [ind_cLastName] ON [dbo].[ProviderLookup_2]([cLastName]) WITH FILLFACTOR = 100 ON [PRIMARY]
GO
CREATE INDEX [ind_Fullname] ON [dbo].[ProviderLookup_2]([Fullname]) WITH FILLFACTOR = 100 ON [PRIMARY]
GO
CREATE INDEX [ind_cMasterNo] ON [dbo].[ProviderLookup_2]([cMasterNo]) WITH FILLFACTOR = 100 ON [PRIMARY]
GO
CREATE INDEX [ind_provdTermination] ON [dbo].[ProviderLookup_2]([provdTermination]) WITH FILLFACTOR = 100 ON [PRIMARY]
GO
CREATE INDEX [ind_provdEffective] ON [dbo].[ProviderLookup_2]([provdEffective]) WITH FILLFACTOR = 100 ON [PRIMARY]
GO
CREATE INDEX [ind_cHIPAANPI] ON [dbo].[ProviderLookup_2]([cHIPAANPI]) WITH FILLFACTOR = 100 ON [PRIMARY]
GO
CREATE INDEX [ind_cTaxID] ON [dbo].[ProviderLookup_2]([cTaxID]) WITH FILLFACTOR = 100 ON [PRIMARY]
GO
CREATE INDEX [ind_cSSN] ON [dbo].[ProviderLookup_2]([cSSN]) WITH FILLFACTOR = 100 ON [PRIMARY]
GO
Any help would be much appreciated.
Thanks,
Suku
April 14, 2008 at 1:17 am
Are you using SQL 2005? If so, can you post the execution plan please (save as a .sqlplan file, zip and attach please)
My initial thoughts - 90 000 out of 1.5 milion is probably too high for a nonclustered index seek with a bookmark lookup. None of your NC indexes could cover the query.
I'm not sure why the clustered index didn't help though. Is there any blocking?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 14, 2008 at 10:37 am
Hi,
I'm using SQL 2000 and I didn't know how to save it as .sqlpan, so attaching screenshots of the exec plan. I will check how to do this.
It takes about more than 30 seconds now from office (From my aptmt it takes more time. Don't know why!) But, even 30 seconds is too long because, the fron-end has a timeout limit of 25 seconds.
From the exec plan, it looks like the clustered index seek is being issued, but it still does takes 30 seconds. How can we improve it so that it runs faster?
Rgds,
Suganya
April 14, 2008 at 11:09 am
why the id column? if you are only querying that one column just use it alone in the clustered index. i have tables where i have a int column for a generic PK and most times it's nonclustered because the queries are by date
April 14, 2008 at 11:21 am
Hi,
I had to go with the composite clustered index(cSpecialty, id) columns becuase, the cSpecialty column is not unique by itself, so had to join that with the auto increment field.
Rgds.
April 14, 2008 at 11:40 am
Ooohhh... not good. Just wait till you try a bunch of inserts that are close to each other for the cSpecialty column... it's a 4 letter word... WAIT....
--Jeff Moden
Change is inevitable... Change for the better is not.
April 14, 2008 at 11:43 am
suku (4/14/2008)
Hi,I had to go with the composite clustered index(cSpecialty, id) columns becuase, the cSpecialty column is not unique by itself, so had to join that with the auto increment field.
Rgds.
don't think the data in a clustered index has to be unique
we have them on data that is not unique. you are probably thinking of a primary key that creates a clustered index by default. if i'm lazy like i'm usually and use the GUI to create a table and schema i create the clustered index first and then the PK
April 14, 2008 at 12:04 pm
Jeff Moden (1/22/2008)
The PK Clustered Index you have might be good for something else, but it's no good for this query even though the execution plan say's you're using the clustered index... doesn't even have the NB column in it.If you want this query to fly, you'll need to add the following index... takes a bit to add but tests on 10 million rows produces the correct return in less than a second...
CREATE INDEX NDX_tbl_computedBaseReportLogs1
ON dbo.tbl_computedBaseReportLogs
(fk_tbl_websites_id, Date, fk_tbl_baseReports_id, itemId, nb)
Just curious - have you tried leveraging the actual INCLUDE clause in there? I would have gone for NB in the INCLUDE clause? (since the order by is on an aggregate of the field).
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 14, 2008 at 6:57 pm
Only if the tuning Wizard says so... haven't figured them well enough to do it on my own, yet... I guess it is a possibility here...
--Jeff Moden
Change is inevitable... Change for the better is not.
April 15, 2008 at 12:20 am
SQL Noob (4/14/2008)
don't think the data in a clustered index has to be unique
It doesn't have to be unique, but it it's not, SQL will make it unique behind the scenes. It does this by adding a int to the clustering key.
suku: Please post SQL 2000 questions in the SQL 2000 forums. By posting it here you're going to get lots of suggestions for things that don't work on SQL 2000 (like saving the plan and include columns), ultimatly wasting people's time.
Now, the query. From the looks of the pics you posted, that's about as fast as it can get. A covering NC index might be slightly better, but because of the text field you're retrieving and the number of columns that would be required in the index key, that isn't possible in SQL 2000
The Specialty column is not the best choice for a custered index key. Clustering keys should (IMHO) be narrow, very selective and non-changing. A Varchar(50) is not narrow and, from what you've said of the data distribution, not at all selective.
I can't recomend you a better clustering key without seeing the other queries that get run on the table a well as some indication of how data gets in the table.
My guess, at this point, it that the 40sec is the time required to get that data and send it to the client. If you're retrieving 92000 rows, with an average row size of 500 bytes. That means your query has to move 45 MB of data from the server, across the network and to the client. The probably reason for it being slower from your apartment is that you have less bandwidth there.
Is it really necessary that the front end gets the entire 92000 rows in one go? What's it going to do with that quantity of data?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply