Query optimization help
Posted Thursday, April 17, 2014 10:19 AM by ScottPletcher
Lynn Pettis (4/17/2014)
ScottPletcher (4/17/2014)
You need to do further research before you change the existing index to a filtered index, because that index will no longer satisfy queries originally using it if you filter out the values it was using.

Besides, I was hoping that the date range itself would eliminate enough rows to improve I/O significantly without impairing any existing uses of the index.


What date range? The query wants EVERYTHING prior to a specific date.



That's still a date range, from the beginning date to the specified date. As I noted earlier, specific row counts would be necessary to do a full analysis, but coming in by date at least guarantees reducing the total number of rows that need to be searched.

Isn't it a far greater concern that willy-nilly changing an existing index to a filtered index could destroy the performance of a large number of existing queries that currently perform quite well?


SQL DBA,SQL Server MVP('07, '08, '09)

Carl Sagan said: "There is no such thing as a dumb question." Sagan obviously never watched a congressional hearing!
Post #1562752
Posted Thursday, April 17, 2014 10:28 AM by Lynn Pettis


ScottPletcher (4/17/2014)
Lynn Pettis (4/17/2014)
ScottPletcher (4/17/2014)
You need to do further research before you change the existing index to a filtered index, because that index will no longer satisfy queries originally using it if you filter out the values it was using.

Besides, I was hoping that the date range itself would eliminate enough rows to improve I/O significantly without impairing any existing uses of the index.


What date range? The query wants EVERYTHING prior to a specific date.



That's still a date range, from the beginning date to the specified date. As I noted earlier, specific row counts would be necessary to do a full analysis, but coming in by date at least guarantees reducing the total number of rows that need to be searched.

Isn't it a far greater concern that willy-nilly changing an existing index to a filtered index could destroy the performance of a large number of existing queries that currently perform quite well?


There is no specified start date. For all we know there could be 10+ years of data in this table and the query wants it all up to a specified point in time. As more data is added and the endpoint of this query progresses, more data will be pulled by the query.

And actually, I would create a new filtered index, not modify an existing index to be a filtered index. Modifying an existing index to be a filtered index could break other queries dependent on that index.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1562757
Posted Thursday, April 17, 2014 10:37 AM by ScottPletcher
Lynn Pettis (4/17/2014)
ScottPletcher (4/17/2014)
Lynn Pettis (4/17/2014)
ScottPletcher (4/17/2014)
You need to do further research before you change the existing index to a filtered index, because that index will no longer satisfy queries originally using it if you filter out the values it was using.

Besides, I was hoping that the date range itself would eliminate enough rows to improve I/O significantly without impairing any existing uses of the index.


What date range? The query wants EVERYTHING prior to a specific date.



That's still a date range, from the beginning date to the specified date. As I noted earlier, specific row counts would be necessary to do a full analysis, but coming in by date at least guarantees reducing the total number of rows that need to be searched.

Isn't it a far greater concern that willy-nilly changing an existing index to a filtered index could destroy the performance of a large number of existing queries that currently perform quite well?


There is no specified start date. For all we know there could be 10+ years of data in this table and the query wants it all up to a specified point in time. As more data is added and the endpoint of this query progresses, more data will be pulled by the query.

And actually, I would create a new filtered index, not modify an existing index to be a filtered index. Modifying an existing index to be a filtered index could break other queries dependent on that index.


It's still a date range, from the starting date to a specified ending date. Not sure why you'd object to that. For all we know, they may run queries that specify an ending date where they need only one year's worth of data. If they need to read 10+ years' worth of data, they'll have to do that anyway.

And actually, you did suggest changing the existing index to filtered.

I don't believe it's worth the big overhead of an additional index on this large a table when you already have an index that, with a very small addition, can cover this query.

It was clear from earlier runs that other filters on the data didn't limit the data dramatically either. Given everything, with all known trade-offs considered, I'd rather have SQL read even a million more 15-byte rows than create another index. Maybe if this query is to run frequently, and/or at peak/critical times, you might create a custom index. But otherwise, I don't see how such an index is worth its high cost.
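(A minimal sketch of what that "very small addition" could look like in practice: SQL Server cannot add a column to an index in place, so the existing index would be rebuilt with the extra INCLUDE columns via DROP_EXISTING. The index name and key list below are purely illustrative, since the existing index definition was never posted in this thread.)

CREATE NONCLUSTERED INDEX [IX_Existing_Illustrative]   -- hypothetical name standing in for the real index
ON dbo.OperatorCBGEstimate (DateEndedStandard)         -- assumed existing key column(s), left unchanged
INCLUDE (KeyInstn, MostRecentSequence, UpdOperation)   -- the small addition needed to cover the query
WITH (DROP_EXISTING = ON);                             -- rebuild in place; existing uses keep working because the keys are unchanged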


SQL DBA,SQL Server MVP('07, '08, '09)

Carl Sagan said: "There is no such thing as a dumb question." Sagan obviously never watched a congressional hearing!
Post #1562768
Posted Thursday, April 17, 2014 10:44 AM by Lynn Pettis


ScottPletcher (4/17/2014)
Lynn Pettis (4/17/2014)
ScottPletcher (4/17/2014)
Lynn Pettis (4/17/2014)
ScottPletcher (4/17/2014)
You need to do further research before you change the existing index to a filtered index, because that index will no longer satisfy queries originally using it if you filter out the values it was using.

Besides, I was hoping that the date range itself would eliminate enough rows to improve I/O significantly without impairing any existing uses of the index.


What date range? The query wants EVERYTHING prior to a specific date.



That's still a date range, from the beginning date to the specified date. As I noted earlier, specific row counts would be necessary to do a full analysis, but coming in by date at least guarantees reducing the total number of rows that need to be searched.

Isn't it a far greater concern that willy-nilly changing an existing index to a filtered index could destroy the performance of a large number of existing queries that currently perform quite well?


There is no specified start date. For all we know there could be 10+ years of data in this table and the query wants it all up to a specified point in time. As more data is added and the endpoint of this query progresses, more data will be pulled by the query.

And actually, I would create a new filtered index, not modify an existing index to be a filtered index. Modifying an existing index to be a filtered index could break other queries dependent on that index.


It's still a date range, from the starting date to a specified ending date. Not sure why you'd object to that. For all we know, they may run queries that specify an ending date where they need only one year's worth of data. If they need to read 10+ years' worth of data, they'll have to do that anyway.

And actually, you did suggest changing the existing index to filtered.

I don't believe it's worth the big overhead of an additional index on this large a table when you already have an index that, with a very small addition, can cover this query.

It was clear from earlier runs that other filters on the data didn't limit the data dramatically either. Given everything, with all known trade-offs considered, I'd rather have SQL read even a million more 15-byte rows than create another index. Maybe if this query is to run frequently, and/or at peak/critical times, you might create a custom index. But otherwise, I don't see how such an index is worth its high cost.



The only way you can interpret the following as modifying an existing index is by the name. Nowhere in the post did I say modify an index to be a filtered index, and the code provided would fail if the index name already existed:

Lynn Pettis (4/16/2014)
As it looks like you are using SQL Server 2008 perhaps a filtered index. Not sure if it will help, but it is worth testing.


CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947]
ON dbo.OperatorCBGEstimate (DateEndedStandard)
INCLUDE (KeyInstn)
WHERE MostRecentSequence = 1 and UpdOperation < 2
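(To illustrate the point about the name clash, a small sketch; the existence check below is an illustration I am adding, not part of the original suggestion. Without WITH (DROP_EXISTING = ON), re-running the CREATE INDEX while the name is already taken fails with error 1913 rather than silently modifying the existing index.)

IF NOT EXISTS (
    SELECT 1
    FROM sys.indexes
    WHERE name = N'AI_OperatorCBGEstimate_14947'
      AND object_id = OBJECT_ID(N'dbo.OperatorCBGEstimate')
)
BEGIN
    -- Only created when no index of that name exists yet
    CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947]
    ON dbo.OperatorCBGEstimate (DateEndedStandard)
    INCLUDE (KeyInstn)
    WHERE MostRecentSequence = 1 and UpdOperation < 2;
END;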







Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1562774
Posted Thursday, April 17, 2014 9:43 PM by TomThomson


EasyBoy (4/16/2014)
Thanks Scott and Lynn for your inputs.
As suggested by Scott, I have made changes to the existing index with the filtered index (suggested by Lynn), and I am able to bring the logical reads down to 30k from 70k.

CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947] 
ON OperatorCBGEstimate ( DateEndedStandard, UpdOperation )
INCLUDE ( KeyInstn, MostRecentSequence )
WHERE MostRecentSequence = 1 and UpdOperation < 2 WITH ( DROP_EXISTING = ON )


I may be confused here, but if I've understood correctly the optimiser might be able to do somewhat better with an index like this:-
CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947]
ON OperatorCBGEstimate ( KeyInstn, DateEndedStandard )
WHERE MostRecentSequence = 1 and UpdOperation < 2 WITH ( DROP_EXISTING = ON )

It won't get the logical reads down any more than the one you tried, but it might reduce other costs.
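(A quick way to see the difference being described here, as a sketch: run the query once under each candidate index with the statistics options below switched on, then compare CPU time and whether the actual plan shows a Stream Aggregate or a Hash Match (Aggregate) - logical reads alone won't show it.)

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- run the GROUP BY query here under each candidate index and compare
-- the CPU time reported on the Messages tab and the aggregate operator
-- (Stream Aggregate vs Hash Match) in the actual execution plan

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;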


Tom
Post #1562906
Posted Friday, April 18, 2014 7:05 AM by Lynn Pettis


TomThomson (4/17/2014)
EasyBoy (4/16/2014)
Thanks Scott and Lynn for your inputs.
As suggested by Scott, I have made changes to the existing index with the filtered index (suggested by Lynn), and I am able to bring the logical reads down to 30k from 70k.

CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947] 
ON OperatorCBGEstimate ( DateEndedStandard, UpdOperation )
INCLUDE ( KeyInstn, MostRecentSequence )
WHERE MostRecentSequence = 1 and UpdOperation < 2 WITH ( DROP_EXISTING = ON )


I may be confused here, but if I've understood correctly the optimiser might be able to do somewhat better with an index like this:-
CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947] 
ON OperatorCBGEstimate ( KeyInstn, DateEndedStandard )
WHERE MostRecentSequence = 1 and UpdOperation < 2 WITH ( DROP_EXISTING = ON )

It won't get the logical reads down any more than the one you tried, but it might reduce other costs.


Sort of like the one I posted earlier, except I only indexed on DateEndedStandard and made KeyInstn an included column. Based on the query, if I were to index on both columns I would swap them around from what you have listed.




Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1563011
Posted Friday, April 18, 2014 4:44 PM by TomThomson


Lynn Pettis (4/18/2014)
TomThomson (4/17/2014)
I may be confused here, but if I've understood correctly the optimiser might be able to do somewhat better with an index like this:-
CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947] 
ON OperatorCBGEstimate ( KeyInstn, DateEndedStandard )
WHERE MostRecentSequence = 1 and UpdOperation < 2 WITH ( DROP_EXISTING = ON )

It won't get the logical reads down any more than the one you tried, but it might reduce other costs.


Sort of like the one I posted earlier, except I only indexed on DateEndedStandard and made KeyInstn an included column. Based on the query, if I were to index on both columns I would swap them around from what you have listed.


Yes, in fact I started from the one you posted. I think making the item being grouped on the first key in the index may reduce the hashing cost - with the index elements in that order the optimiser knows that it will get consecutively biggish bunches of rows with the same KeyInstn, and may be capable of deciding not to compute the hash in the case where it's going to be the same as the previous one. Although I don't know enough about SQL Server's optimiser to say whether it will or won't do that, I do know that some optimisers decades ago did, so it seems possible that SQL Server does. I was clutching at straws here, of course; it is definitely a try-it-and-see thing, not a firm recommendation.


Tom
Post #1563200
Posted Tuesday, April 22, 2014 1:57 AM by ChrisM@Work


TomThomson (4/18/2014)
Lynn Pettis (4/18/2014)
TomThomson (4/17/2014)
I may be confused here, but if I've understood correctly the optimiser might be able to do somewhat better with an index like this:-
CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947] 
ON OperatorCBGEstimate ( KeyInstn, DateEndedStandard )
WHERE MostRecentSequence = 1 and UpdOperation < 2 WITH ( DROP_EXISTING = ON )

It won't get the logical reads down any more than the one you tried, but it might reduce other costs.


Sort of like the one I posted earlier, except I only indexed on DateEndedStandard and made KeyInstn an included column. Based on the query, if I were to index on both columns I would swap them around from what you have listed.


Yes, in fact I started from the one you posted. I think making the item being grouped on the first key in the index may reduce the hashing cost - with the index elements in that order the optimiser knows that it will get consecutively biggish bunches of rows with the same KeyInstn, and may be capable of deciding not to compute the hash in the case where it's going to be the same as the previous one. Although I don't know enough about SQL Server's optimiser to say whether it will or won't do that, I do know that some optimisers decades ago did, so it seems possible that SQL Server does. I was clutching at straws here, of course; it is definitely a try-it-and-see thing, not a firm recommendation.


It's not clutching at straws at all, but there's a caveat, and a balancing act too. I've set up a model based on the information provided by the OP, and in the model this index does give rise to a streaming aggregate instead of a hash aggregate. The caveat is this - the optimiser will only choose it if there isn't an index available with DateEndedStandard as the leading edge. And the balancing act is this: even with a filtered index, if KeyInstn is the first key (to encourage the streaming aggregate) then the DateEndedStandard predicate is residual and hence relatively expensive. If DateEndedStandard is the leading edge then there's no filter, just a seek (range scan?) but at the expense of a hash aggregate.
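(To make the balancing act concrete, a sketch of the two orderings being weighed here, built from the columns and filter already posted in this thread; the index names are illustrative only.)

-- Option 1: KeyInstn leading. Rows arrive already grouped by KeyInstn, so the
-- optimiser can use a Stream Aggregate, but DateEndedStandard < @Current is a
-- residual predicate evaluated against every row read from the filtered index.
CREATE NONCLUSTERED INDEX IX_Sketch_KeyInstnFirst
ON dbo.OperatorCBGEstimate (KeyInstn, DateEndedStandard)
WHERE MostRecentSequence = 1 and UpdOperation < 2;

-- Option 2: DateEndedStandard leading. The date predicate becomes a seek
-- (range scan), but the rows are no longer ordered by KeyInstn, so the
-- aggregate comes back as a Hash Match instead of a Stream Aggregate.
CREATE NONCLUSTERED INDEX IX_Sketch_DateFirst
ON dbo.OperatorCBGEstimate (DateEndedStandard)
INCLUDE (KeyInstn)
WHERE MostRecentSequence = 1 and UpdOperation < 2;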


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1563713
Posted Tuesday, April 22, 2014 7:03 AM by ChrisM@Work


EasyBoy (4/16/2014)
ChrisM@Work (4/16/2014)
EasyBoy (4/15/2014)
I did try almost all combinations of altering columns for the index, but somehow the logical reads are the same (67000).


Did you look at the execution plans to see if the hash aggregate was replaced with a stream aggregate? If you're not sure, then create the index which Gail suggests and post the execution plan here.


Please find the updated execution plan.


Can anyone reading this thread account for the columns in the defined values list or output list of the index seek operator on this plan? Here they are:

[InternalUseOnly_New].[dbo].[OperatorCBGEstimate].KeyInstn,
[InternalUseOnly_New].[dbo].[OperatorCBGEstimate].DateEndedStandard,
Uniq1002,
[InternalUseOnly_New].[dbo].[OperatorCBGEstimate].BlockGroupFIPS,
[InternalUseOnly_New].[dbo].[OperatorCBGEstimate].UpdOperation,
[InternalUseOnly_New].[dbo].[OperatorCBGEstimate].DateEstimated

Only the first two columns are required by the next operator (Stream Aggregate), and three of the remaining four aren't even referenced by the query. What little literature I have doesn't come close and I can't find anything useful with Boogle either.
Thanks.


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1563832
Posted Tuesday, April 22, 2014 8:22 AM by ChrisM@Work


A little more info - this is the same question on MSDN:

I have one simple query to optimize, but somehow after creating a couple of indexes this query still has high logical reads and time.

DECLARE @CURRENT smalldatetime
set @current = '6/30/2013'

select KeyInstn, MAX(dateendedstandard) DateEndedStandard
from InternalEstimate with (index(AI_OperatorCBGEstimate_HAP))
where dateendedstandard < @Current and mostrecentsequence = 1 and UpdOperation<2
group by KeyInstn


I have created the index as:
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[OperateEstimate] ([MostRecentSequence],[DateEndedStandard],[UpdOperation])
INCLUDE ([KeyInstn])



Is that two different tables or is one of them a view?


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1563876