Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Ranking functions, views, and predicates Expand / Collapse
Author
Message
Posted Monday, March 18, 2013 12:52 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, April 17, 2014 7:33 AM
Points: 104, Visits: 385
So, I ran into a nifty little "bug" I'll call it. From what I can find this was an issue in SQL Server 2005 but supposedly addressed in SQL Server 2008. I'm running Microsoft SQL Server 2008 R2 (SP2) - 10.50.4260.0 (X64). Here is my scenario:

View is defined as follows:
CREATE VIEW dbo.vw_BatchItems
SELECT
BatchID,
FieldA,
FieldB,
CAST(ISNULL(ROW_NUMBER() OVER (PARTITION BY BatchID ORDER BY DateEntered), 0) AS INT) AS CheckOrder
FROM <myDatabase>.<mySchema>.<myTable>;

Now when I code a procedure to query the view and add a WHERE clause it looks like this:
CREATE VIEW dbo.usp_sel_BatchItems (@BatchID INT)
AS

SELECT *
FROM <myDatabase>.<mySchema>.vw_BatchItems
WHERE BatchID = @BatchID;

We found that this procedure now is in our top 10 (well, the top 1 of the top 10) I/O consuming procedures (the underlying tables are large). While the query still runs fast we found that the ROW_NUMBER is causing an index scan on the index over BatchID. When we remove the ranking function from the view the SELECT statement now does an index seek on the index over BatchID. And after reviewing both execution plans it clearly shows that the predicate is not pushed to the view when the ROW_NUMBER() function is in the view. Without it the predicate is pushed to the view.

Any thoughts on this one? Is there a way to force SQL server to perform the WHERE condition first instead of scanning the entire index to resolve the ROW_NUMBER() function? In our case, we only need the function to return results based on the filtered criteria anyway.
Post #1432313
Posted Monday, March 18, 2013 1:43 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 7:07 AM
Points: 22,507, Visits: 30,223
What happens if you take this out of the call to ROW_NUMBER(): PARTITION BY BatchID. Seem unnecessary since you are calling it with a specific BatchID, unless you also need the the view for other things with all the BatchD's present.


You are probably getting the scan because of the PARTITION BY clause.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1432324
Posted Monday, March 18, 2013 3:42 PM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Monday, April 14, 2014 10:55 AM
Points: 532, Visits: 984
What about taking the row_number out of vw_BatchItems completely and putting row_number() over (order by DateEntered) in the usp_sel_BatchItems view?
Post #1432368
Posted Tuesday, March 19, 2013 8:14 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 7:50 AM
Points: 4,128, Visits: 5,838
Erin Ramsay (3/18/2013)
What about taking the row_number out of vw_BatchItems completely and putting row_number() over (order by DateEntered) in the usp_sel_BatchItems view?


SQL Server 2008R2 build 4266

USE AdventureWorks
go
CREATE VIEW test1 as
SELECT
ProductID,
CarrierTrackingNumber,
UnitPrice,
CAST(ISNULL(ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY ModifiedDate), 0) AS INT) AS rownum
FROM Sales.SalesOrderDetail

2 records, INDEX seek, KEY lookup AS expected
SELECT *
FROM test1
WHERE ProductID = 897 --707

3083 records, table scan AS expected
SELECT *
FROM test1
WHERE ProductID = 707


Why are you doing the CAST ISNULL thing? Seems unnecessary to me, and this view avoids that "unnecessary" ComputeScalar in the plan:

CREATE VIEW test2 as
SELECT
ProductID,
CarrierTrackingNumber,
UnitPrice,
ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY ModifiedDate) AS rownum
FROM Sales.SalesOrderDetail

Given the above, it is possible your build of SQL Server has a bug/regression in it (in which case you could try taking out the CAST and see if that makes a difference and also eliminate the view and see if you get a seek). MUCH more likely IMHO is that the number of rows you get out of the query simply make the optimizer pick a scan as the most efficient access mechanism. Try forcing the index seek and see if it is more efficient. I note this could also be a simple case of parameter sniffing, where the first batchid called has a high estimated rows and gets the scan? Maybe not given your comment about with and without the ROW_NUMBER. Again please try without the CAST. Also, if you put the ROW_NUMBER in the sproc, what effect do you get? Also try OPTION(RECOMPILE on the statement in the sproc to see effect...


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1432673
Posted Tuesday, March 19, 2013 8:54 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, April 17, 2014 7:33 AM
Points: 104, Visits: 385
Lynn Pettis (3/18/2013)
What happens if you take this out of the call to ROW_NUMBER(): PARTITION BY BatchID. Seem unnecessary since you are calling it with a specific BatchID, unless you also need the the view for other things with all the BatchD's present.


You are probably getting the scan because of the PARTITION BY clause.


Lynn,

Unfortunately this isn't the case. Even with the PARTITION BY clause, SQL Server is still performing an index scan over the index and returning all the records in the table (as evidenced in the execution plan). I think this might be a bad usage of a ranking function (or possibly a bug in SQL Server for not pushing the predicate down into the view).
Post #1432706
Posted Tuesday, March 19, 2013 8:56 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, April 17, 2014 7:33 AM
Points: 104, Visits: 385
Erin Ramsay (3/18/2013)
What about taking the row_number out of vw_BatchItems completely and putting row_number() over (order by DateEntered) in the usp_sel_BatchItems view?


In this case that will not work as the code base uses this same view to provide an interface for the entity framework. The developer team uses this view in this context even though they don't call it directly. Otherwise that would work perfectly as we've even tested that. This is the primary reason why we are most likely going to just maintain a column to contain this information and scrap the usage of ROW_NUMBER() altogether in this case.
Post #1432708
Posted Tuesday, March 19, 2013 9:03 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, April 17, 2014 7:33 AM
Points: 104, Visits: 385
@TheSQLGuru,

Why the developer used ISNULL / CAST - not sure. Didn't make sense to me but even without those the execution plan remains the same. I'm also leary of suggesting using the RECOMPILE option a solution as that just reinforces bad coding practices IMO - unless it is the only justifiable option. But with that said, I did try this already :) and did not witness any changes to the execution plan.

And just to set the stage - I'm approaching this from an I/O perspective. We're moving to a new data center and one of the things we identified is we have a large volume of I/O. After researching our sql and reviewing various metrics we indeed to have some inefficient queries contributing to unnecessary I/O. So I'm simply looking at ways to cut this down.

Also, just to set the perspective, this table contains currently around 60,000 records and grows every day. The distribution of data based on the BatchID is between 1 and 180ish records (per BatchID). I would expect in all cases an index seek and not a scan through every single record in the table.
Post #1432715
Posted Tuesday, March 19, 2013 10:21 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 7:50 AM
Points: 4,128, Visits: 5,838
1) what is estimated and actual row counts from sproc calls?

2) can you post both actual query plans - the one that gets the seek and the one that gets the scan? This one has definitely piqued my curiosity.



Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1432764
Posted Tuesday, March 19, 2013 1:28 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: 2 days ago @ 4:53 PM
Points: 1,746, Visits: 2,553
Since the best plan will vary based on the selectivity of the specific @BatchID passed in, try it with the RECOMPILE option:

SELECT *
FROM <myDatabase>.<mySchema>.vw_BatchItems
WHERE BatchID = @BatchID
OPTION ( RECOMPILE );




SQL DBA,SQL Server MVP('07, '08, '09)
I'm not fat, I'm gravity challenged.
Post #1432876
Posted Tuesday, March 19, 2013 1:29 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Monday, April 07, 2014 7:35 AM
Points: 1,172, Visits: 2,413
tafountain (3/19/2013)
@TheSQLGuru,

Why the developer used ISNULL / CAST - not sure. Didn't make sense to me but even without those the execution plan remains the same. I'm also leary of suggesting using the RECOMPILE option a solution as that just reinforces bad coding practices IMO - unless it is the only justifiable option. But with that said, I did try this already :) and did not witness any changes to the execution plan.

And just to set the stage - I'm approaching this from an I/O perspective. We're moving to a new data center and one of the things we identified is we have a large volume of I/O. After researching our sql and reviewing various metrics we indeed to have some inefficient queries contributing to unnecessary I/O. So I'm simply looking at ways to cut this down.

Also, just to set the perspective, this table contains currently around 60,000 records and grows every day. The distribution of data based on the BatchID is between 1 and 180ish records (per BatchID). I would expect in all cases an index seek and not a scan through every single record in the table.


It's not outside the realm of possibility that the query optimizer decided that it's more efficient to scan a 60,000 row table than to perform a seek. The tipping point for that sort of thing tends to be much lower than people may think.

Also, you didn't post the definition of the index used by the seek when the ROW_NUMBER() function is removed from the view definition. I noticed that the ORDER BY argument for the ROW_NUMBER() function includes the column DateEntered that does not appear anywhere else in the query. Does the index on BatchID cover the query when this column is referenced (i.e., is DateEntered part of the key or an included column in that index)? If not, I think it is very likely that the optimizer would choose to scan a 60,000-row table rather than perform an index seek followed by a key lookup to get the DateEntered value.


Jason Wolfkill
Blog: SQLSouth
Twitter: @SQLSouth
Post #1432877
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse