Custom Pagination in SQL Server 2005

Question

Custom Pagination in SQL Server 2005

Viewing 15 posts - 46 through 60 (of 69 total)

You must be logged in to reply to this topic. Login to reply

phystech SSC Veteran Points: 263 More actions · Answer 1

Adam Haines (1/6/2009)
WHERE CASE
WHEN @FirstName IS NOT NULL THEN
CASE WHEN FirstName = @FirstName THEN 1
ELSE 0 END
ELSE 1 END = 1
---
instead of
---
WHERE [FirstName] = COALESCE(@FirstName,FirstName)
---
The difference is that column name "FirstName" is not involved in
"CASE
WHEN @FirstName IS NOT NULL THEN"
Thus we have some sort of short circuiting preventing from the full table scan (At least I beleive so).
I believe this will still cause an index scan/table scan. The problem I see with this is the variable in the predicate, which means the optimizer will have to compare the variable with the column value for each row to determine if it should be given the value of 1.

The optimizer knows that

"CASE WHEN @FirstName IS NOT NULL THEN"

can be evaluated before the query is launched, since no column name is involved here and the resulting value could be either 0 or 1. So it can branch the plan into two subtrees. One subtree would result in logical "true" right at this point (no @FirstName was supplied) and the second subtree would go evaluating "FirstName = @FirstName " using the index, that is very quickly.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 2

phystech (1/6/2009)
Adam Haines (1/6/2009)
WHERE CASE
WHEN @FirstName IS NOT NULL THEN
CASE WHEN FirstName = @FirstName THEN 1
ELSE 0 END
ELSE 1 END = 1
---
instead of
---
WHERE [FirstName] = COALESCE(@FirstName,FirstName)
---
The difference is that column name "FirstName" is not involved in
"CASE
WHEN @FirstName IS NOT NULL THEN"
Thus we have some sort of short circuiting preventing from the full table scan (At least I beleive so).
I believe this will still cause an index scan/table scan. The problem I see with this is the variable in the predicate, which means the optimizer will have to compare the variable with the column value for each row to determine if it should be given the value of 1.
The optimizer knows that
"CASE WHEN @FirstName IS NOT NULL THEN"
can be evaluated before the query is launched, since no column name is involved here and the resulting value could be either 0 or 1. So it can branch the plan into two subtrees. One subtree would result in logical "true" right at this point (no @FirstName was supplied) and the second subtree would go evaluating "FirstName = @FirstName " using the index, that is very quickly.

Lets assume this is true (I did not test it so won't state yea or nay on that). What you WILL get is a cached plan with whatever gets executed first. That guarantees that other calls that want the opposite plan will be completely suboptimal in performance. If you are on SQL 2005 you can use OPTION (RECOMPILE) on the statement to help out, but that still assumes that the optimizer can and does do the right thing with whatever inputs are given, which I question.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

phystech SSC Veteran Points: 263 More actions · Answer 3

TheSQLGuru (1/7/2009)

phystech (1/6/2009)

Adam Haines (1/6/2009)
The optimizer knows that
"CASE WHEN @FirstName IS NOT NULL THEN"
can be evaluated before the query is launched, since no column name is involved here and the resulting value could be either 0 or 1. So it can branch the plan into two subtrees. One subtree would result in logical "true" right at this point (no @FirstName was supplied) and the second subtree would go evaluating "FirstName = @FirstName " using the index, that is very quickly.
Lets assume this is true (I did not test it so won't state yea or nay on that). What you WILL get is a cached plan with whatever gets executed first. That guarantees that other calls that want the opposite plan will be completely suboptimal in performance. If you are on SQL 2005 you can use OPTION (RECOMPILE) on the statement to help out, but that still assumes that the optimizer can and does do the right thing with whatever inputs are given, which I question.

I've been using the approach within rather complex query for some months. And my strong impression is that the short-circuiting works. Though I don't short-circuit it on NULL value, I do it on zero string length, but this should not be different. I'll try to test the approach "stand-alone" within a week and publish on my blog.

Thanks.

Anipaul SSC-Insane Points: 24681 More actions · Answer 4

Anipaul

SSC-Insane

Points: 24681

January 7, 2009 at 10:21 am

#922709

Nice one.. The discussion also interesting..

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 5

I've been using the approach within rather complex query for some months. And my strong impression is that the short-circuiting works. Though I don't short-circuit it on NULL value, I do it on zero string length, but this should not be different. I'll try to test the approach "stand-alone" within a week and publish on my blog.

Thanks.

If you never passed in a NULL value then the CASE is surperfluous, right? You will never get (or need) a plan that executes the ELSE. I will say that it would be VERY helpful if this construct gives an index seek when appropriate and a table scan when it is not, but there will still be the cached plan issue to deal with.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

phystech SSC Veteran Points: 263 More actions · Answer 6

TheSQLGuru (1/7/2009)
...If you never passed in a NULL value then the CASE is surperfluous, right?..

No, zero length string plays the role of NULL. That is if the parameter is zero length then grab all the records (sort of).

Adam Haines SSC-Insane Points: 23217 More actions · Answer 7

phystech (1/7/2009)

TheSQLGuru (1/7/2009)

phystech (1/6/2009)
Adam Haines (1/6/2009)
The optimizer knows that
"CASE WHEN @FirstName IS NOT NULL THEN"
can be evaluated before the query is launched, since no column name is involved here and the resulting value could be either 0 or 1. So it can branch the plan into two subtrees. One subtree would result in logical "true" right at this point (no @FirstName was supplied) and the second subtree would go evaluating "FirstName = @FirstName " using the index, that is very quickly.
Lets assume this is true (I did not test it so won't state yea or nay on that). What you WILL get is a cached plan with whatever gets executed first. That guarantees that other calls that want the opposite plan will be completely suboptimal in performance. If you are on SQL 2005 you can use OPTION (RECOMPILE) on the statement to help out, but that still assumes that the optimizer can and does do the right thing with whatever inputs are given, which I question.
I've been using the approach within rather complex query for some months. And my strong impression is that the short-circuiting works. Though I don't short-circuit it on NULL value, I do it on zero string length, but this should not be different. I'll try to test the approach "stand-alone" within a week and publish on my blog.
Thanks.

Phystech,

Can you confirm that the code you are posting actually generates index seeks? I still do not see how it is possible. I wan to step through the code you posted and demonstrate how the optimizer will treat each section.

WHERE CASE

WHEN @FirstName IS NOT NULL THEN

CASE WHEN FirstName = @FirstName THEN 1

ELSE 0 END

ELSE 1 END = 1 and

The first (outer) case expression does exactly what you say it will do. If the variable is null, the filter is short circuited. I agree with this. It is the inner case expression that I have a problem with. The inner case expression compares each FirstName value in the table to the variable @FirstName. I am sure we can both agree on this; however, I believe that the optimizer is treating this differently than you are expecting. The inner case expression will have to be evaluated for every row returned by the query. The optimizer has no way of knowing whether or not the FirstName Column equals @FirstName until it checks each row returned. You can confirm this by using your code in my initial stored procedure, with your modification.

WHERE CASE

WHEN @FirstName IS NOT NULL THEN

CASE WHEN FirstName = @FirstName THEN 1

ELSE 0 END

ELSE 1 END = 1 and

CASE

WHEN @LastName IS NOT NULL THEN

CASE WHEN LastName = @LastName THEN 1

ELSE 0 END

ELSE 1 END = 1

You will see no matter what combination of FirstName and LastName is used the end result is the same. The behavior is the same regardless of the case expression being in the select statement or the predicate. This behavior occurs because the optimizer has to look through each returned rows and evaluate FirstName to see if it matches the variable. The optimizer cannot seek a set of rows with the case expression and you cannot short circuit the inner case statement because you can potentially exclude rows. However, there is a small catch that may be covering up the actual behavior, in your environment. if you apply a differing filter that is always present, the case expressions behavior can be hidden away in another execution plan task. For example, you use your code in the where clause on firstname and use LastName like @LastName + '%', for lastName. You will see that the scan is gone. The scan is gone because the optimizer was able to use an index and then can simple apply the case express on each of the returned rows. The bottom line is you will always get an index scan using the logic you have posted, unless you use some other filter that gaurentees a seek. I would still recommend avoiding case expressions in the where clause, but to each is own.

My blog: http://jahaines.blogspot.com

phystech SSC Veteran Points: 263 More actions · Answer 8

Adam Haines (1/7/2009)
...Can you confirm that the code you are posting actually generates index seeks?...

I need a couple of days to prove/disprove it, right now I'm working on my current project. I'll post here then.

Thanks.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 9

phystech (1/7/2009)
TheSQLGuru (1/7/2009)
...If you never passed in a NULL value then the CASE is surperfluous, right?..
No, zero length string plays the role of NULL. That is if the parameter is zero length then grab all the records (sort of).

Your "sort of" is a critical statement!

WHERE CASE

WHEN @FirstName IS NOT NULL THEN

CASE WHEN FirstName = @FirstName THEN 1

ELSE 0 END

ELSE 1 END = 1

The ELSE 1 will never be hit, thus the entire statement can be simplified to this (since you never have a NULL value passed in):

WHERE CASE WHEN FirstName = @FirstName THEN 1

ELSE 0 END = 1

and that, I believe, is no different logically than this:

WHERE FirstName = @FirstName

in which case you will obviously do an index seek if the statistics estimate 1-2% of the rows will be hit based on the incoming value of @FirstName.

And we are back to the parameter sniffing plan caching problem. :w00t:

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

phystech SSC Veteran Points: 263 More actions · Answer 10

TheSQLGuru (1/7/2009)
phystech (1/7/2009)
TheSQLGuru (1/7/2009)
...If you never passed in a NULL value then the CASE is surperfluous, right?..
No, zero length string plays the role of NULL. That is if the parameter is zero length then grab all the records (sort of).
Your "sort of" is a critical statement!
WHERE CASE
WHEN @FirstName IS NOT NULL THEN
CASE WHEN FirstName = @FirstName THEN 1
ELSE 0 END
ELSE 1 END = 1
The ELSE 1 will never be hit, thus the entire statement can be simplified to this (since you never have a NULL value passed in):
WHERE CASE WHEN FirstName = @FirstName THEN 1
ELSE 0 END = 1
and that, I believe, is no different logically than this:
WHERE FirstName = @FirstName
in which case you will obviously do an index seek if the statistics estimate 1-2% of the rows will be hit based on the incoming value of @FirstName.
And we are back to the parameter sniffing plan caching problem. :w00t:

Please, don't take it too literally I do not compare it to NULL:

WHERE CASE

WHEN LEN(@FirstName) > 0 THEN

CASE WHEN FirstName = @FirstName THEN 1

ELSE 0 END

ELSE 1 END = 1

Adam Haines SSC-Insane Points: 23217 More actions · Answer 11

TheSQLGuru (1/7/2009)
phystech (1/7/2009)
TheSQLGuru (1/7/2009)
...If you never passed in a NULL value then the CASE is surperfluous, right?..
No, zero length string plays the role of NULL. That is if the parameter is zero length then grab all the records (sort of).
Your "sort of" is a critical statement!
WHERE CASE
WHEN @FirstName IS NOT NULL THEN
CASE WHEN FirstName = @FirstName THEN 1
ELSE 0 END
ELSE 1 END = 1
The ELSE 1 will never be hit, thus the entire statement can be simplified to this (since you never have a NULL value passed in):
WHERE CASE WHEN FirstName = @FirstName THEN 1
ELSE 0 END = 1
and that, I believe, is no different logically than this:
WHERE FirstName = @FirstName
in which case you will obviously do an index seek if the statistics estimate 1-2% of the rows will be hit based on the incoming value of @FirstName.
And we are back to the parameter sniffing plan caching problem. :w00t:

Unfortunately, this still should yield a index scan. The case expression is still evaluated for every row, as the optimizer doesnt know if the variable matches the case expression, until it evaluates it. For this reason the optimizer cannot seek and much evaluate each FirstName.

Here is a simple example.

select firstname

from contacts

WHERE

CASE WHEN FirstName = @FirstName

THEN 1

ELSE 0

END = 1

QueryPlan:

|--Index Scan(OBJECT:([tempdb].[dbo].[Contacts].[IXC_FirstName]), WHERE:(CASE WHEN [tempdb].[dbo].[Contacts].[FirstName]=[@firstname] THEN (1) ELSE (0) END=(1)))

My blog: http://jahaines.blogspot.com

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 12

Which is why I was saying it should be decompiled down to it's logical equivalent, which will do the right thing:

create table #t (fname varchar(20))

insert #t (fname)

select top 100 'asdf'

from sysobjects

insert #t (fname) values ('zzzz')

create index a on #t (fname)

declare @fname varchar(20)

set @fname = 'asdf' --does index scan

select *

from #t

WHERE

CASE WHEN fname = @fname

THEN 1

ELSE 0

END = 1

dbcc freeproccache

go

declare @fname varchar(20)

set @fname = 'zzzz' --does index scan as well, even though only 1 row

select *

from #t

WHERE

CASE WHEN fname = @fname

THEN 1

ELSE 0

END = 1

dbcc freeproccache

go

declare @fname varchar(20)

set @fname = 'asdf' --does index seek, since covering and ordered

select fname

from #t

WHERE fname = @fname

dbcc freeproccache

go

declare @fname varchar(20)

set @fname = 'zzzz' --does index seek, since 1 row

select fname

from #t

WHERE fname = @fname

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

peter-757102 SSCertifiable Points: 6877 More actions · Answer 13

First I am back a day later then I expected and definatly had hoped for, sorry for that. And please someone FIX this forum (smileys in code, persistent extended page widths even after I reformatted the code to fit).

Now my 'paging' code that I said would contribute. Bear in mind I work in an enviromment that is able to send parameterized queries from the application layer to the database. The SQL code itself originates from the application layer most of the time where queries are created dynamically with relativly ease.

What I created a while back was a method of reducing the SQL complexity when dealing with paging on the application side. For it to work the main query needs to be written in a CTE and return a sequence number that can be sorded on in the actual query. It is also used by the paging logic in this query to exclude unwanted records among other things.

At the end of the normal query (now wrapped in a CTE), an application function is called to generate the paging part of the query for the developer automatically. The part deals with limiting the amount of data is transmidded by SQL sever to the appliation sever. The code also takes care of out of bounds cases and adjusts the page number accordingly. On top of this an optional part generates a comma seperated list of primary key IDs of all the records in the result, including the ones not returned due to paging.

First we need some demo data (10 000 records should do):

-- Create demo table with primary key

--

create table dbo.DemoTable

(

Id int not null

, Name varchar(36) not null

, SomeData varbinary(20) not null

, constraint pk_DemoTable primary key clustered( ID )

)

;

go

-- Insert 10.000 test records into the demo table

--

insert into dbo.DemoTable( Id, Name, SomeData )

select

numbers.N

, newID()

, hashBytes( 'SHA1', cast( numbers.N as varchar ) )

from

(

select top 10000

row_number() over ( order by t1.object_id )

from

master.sys.All_Columns as t1 with( nolock )

cross join master.sys.All_Columns as t2 with( nolock )

) as numbers( N )

go

-- Index the demo table

--

create index ix_DemoTable_Name on dbo.DemoTable ( Name );

create index ix_DemoTable_SomeData on dbo.DemoTable ( SomeData );

go

-- Show the first few records to get a feel of what the data looks like

--

select top 10 * from dbo.DemoTable where Id <= 10

go

Next is a demo query itself that demonstrates the techniques used and makes clear how easy the results can be processed further:

-- For demo purposes only: these parameters are normaly injected into

-- the paging secopn of the query as constants

--

declare @pgNum int;

declare @pgSize int;

select @pgNum = 300, @pgSize = 25;

;

with

/* This code is put into the page that shows the grid */

pgFromQ as

(

select

row_number() over

(

-- Order by, to be generated by application code so grid sorting can change over time

order by

demo.Name desc

, demo.Id asc -- ALWAYS include the primary key as the last field when sorting

) as 'pgRowNumber'

, demo.Id

, demo.Name

, demo.SomeData

from

dbo.DemoTable demo

)

Then the developer calls a function that generates the rest of the query.

It requires a few parameters:

1: Page to show (4)

2: Page size (25)

3: Optional field name ("Id" in this example).

When specified it causes a comma seperated list to be generated of all the IDs.

This can be used to 'browse' over the results in the any detail screen.

It looks like (in whatever language you use):

pagerSQL( 4, 25, "Id" )

This code then generates the remainder of the query (2 parts). The first part is some more CTEs that handle page clipping and the like.

, pgCountQ( recordCount, pageCount ) as

(

-- determines the number of pages

select top 1

count(*)

, floor( (count(*) + @pgSize - 1) / @pgSize )

from

pgFromQ

)

, pgPageQ( activePage, recordCount, pageCount ) as

(

select top 1

case

when @pgNum < 1

then 1

when @pgNum > pageCount

then pageCount

else @pgNum

end

, recordCount

, pageCount

from

pgCountQ

)

, pgRangeQ( firstRecord, activePage, recordCount, pageCount ) as

(

select top 1

case

when activePage = 0

then 0

else ( @pgSize * ( activePage - 1 ) + 1 )

end

, activePage

, recordCount

, pageCount

from

pgPageQ

)

-- this is the optional part that is generates the comma separated list of a key field ("Id" in this example)

, pgRecListQ( recIDs ) as

(

select

stuff

(

select

cast( ',' as varchar(max) ) + cast( i.Id as varchar )

from

pgFromQ as i

order by

i.pgRowNumber

for

xml path('')

)

, 1

, ''

)

-- end of optional part

The second part of the generated code performs the actual query and makes sure the least amount of data is returned by SQL.

/*

-- And finaly the select that returns the result set (only the first record of the resulting page

-- contains extra such as number of first record, the active page number, the recordcount and the pagecount.

-- And inf choosen also the optional comman separated list of IDs that represent the non-paged result.

*/

select

case when pgRangeQ.recordCount = 0 or pgFromQ.pgRowNumber = pgRangeQ.firstRecord then pgRangeQ.firstRecord else null end as 'pgFirstRecord'

, case when pgRangeQ.recordCount = 0 or pgFromQ.pgRowNumber = pgRangeQ.firstRecord then pgRangeQ.activePage else null end as 'pgActivePage'

, case when pgRangeQ.recordCount = 0 or pgFromQ.pgRowNumber = pgRangeQ.firstRecord then pgRangeQ.recordCount else null end as 'pgRecordCount'

, case when pgRangeQ.recordCount = 0 or pgFromQ.pgRowNumber = pgRangeQ.firstRecord then pgRangeQ.pageCount else null end as 'pgPageCount'

-- this is the optional part that is returns the comma separated list of a key field ("Id" in this example)

, case when pgFromQ.pgRowNumber = pgRangeQ.firstRecord then ( select top 1 recIDs from pgRecListQ ) else null end as 'pgRecordIDs'

-- end of optional part

, pgFromQ.*

from

pgRangeQ

left join pgFromQ on pgFromQ.pgRowNumber between pgRangeQ.firstRecord and pgRangeQ.firstRecord + @pgSize - 1

order by

pgFromQ.pgRowNumber

;

The automaticaly generated code might seem complicated at first but once understood it is not that bad, nor is it expensive. It is nicely hidden from the functionality of the original query and it makes processing the results easy and efficient. The only times I experienced bad performance was when testing with generation of the optional comma separated list over say 100 000 records and when the original/main query was slow. It is always a matter of using the right design for the job, so that ain't a negative for the method itself. It certainly beats all the paging I ever seen before in applications on speed, transmission efficiency and usability.

phystech SSC Veteran Points: 263 More actions · Answer 14

Adam Haines (1/7/2009) ...Unfortunately, this still should yield a index scan. The case expression is still evaluated for every row, as the optimizer doesnt know if the variable matches the case expression, until it evaluates it. For this reason the optimizer cannot seek and much evaluate each FirstName...

You seem to be right. I created a table in MS SQL Managament Studio

CREATE TABLE GUIDs(

GUID uniqueidentifier

)

CREATE INDEX IX_GUID ON GUIDs(GUID)

Populated it with 10 million records with unique GUID in each record. And then tried to fetch an existent/non-existent record from the table using

- direct

WHERE GUID = @GUID

- COALESCE

WHERE GUID = COALESCE(@GUID,GUID)

- and CASE:

WHERE CASE

WHEN @GUID IS NOT NULL THEN

CASE WHEN GUID = @GUID THEN 1

ELSE 0 END

ELSE 1 END = 1

For example:

DECLARE @GUID uniqueidentifier

SET @GUID = '093cd7ed-db22-40fb-9f77-5be0e33aaaaa'

SELECT * FROM GUIDS

WHERE CASE

WHEN @GUID IS NOT NULL THEN

CASE WHEN GUID = @GUID THEN 1

ELSE 0 END

ELSE 1 END = 1

Yes, direct "WHERE GUID = @GUID" fetched the record in no time

"COALESCE" and "CASE" seemed to load the table in memory (~400 Mbyte) on the first run and then full-scanned the table. The loading took about 5 secs and the scan took ~1 sec. The PC was Core Duo 3.16 with 4G memory under Vista 64 Home.

Yes, I did "Display estimated execution plan" in SQL Managament Studio and saw "table scan cost 100%". I just did not beleive my eyes the optimizer was that stupid (in my opinion).

So, my apologies, and thanks for the oportunity to gain the knowledge.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 15

Peter, I couldn't really follow the code you had. Was it supposed to be just separate batch files or was some/all of it supposed to be wrapped up in sprocs/functions/etc? Any chance you can attach it as a zip of appropriate file types?

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Custom Pagination in SQL Server 2005

Cookies on SQLServerCentral