Returning row count as output in CTE

Question

Returning row count as output in CTE

brendan woulfe

Hall of Fame

Points: 3888
More actions
August 15, 2011 at 2:09 pm

#241773

I'm doing something similar to the below example:
DECLARE @RowStart INT
DECLARE @RowEnd INT
SET @RowStart = 1
SET @RowEnd = 100
;WITH CTE AS
(
SELECT a,
b,
c,
ROW_NUMBER() OVER(ORDER BY c) AS RowNumber
FROM table1
)
SELECT a,
b,
c
FROM CTE
WHERE RowNumber BETWEEN @RowStart AND @RowEnd
I had this working fine as a stored procedure until I was told I needed to output the total row count, or the number of records that would have been returned with no paging. I searched and searched and could not find a solution that worked. I'm not able to set the the total row count and return the data from the select statement when using a CTE. My solution to this was to dump the CTE idea and instead use a temp table to get a row count and then return the data with paging, similar to what's below.
SELECT a,
b,
c,
ROW_NUMBER() OVER(ORDER BY c) AS RowNumber
INTO #table1
SET @TotalRows = COUNT(*)
FROM #table1
SELECT a,
b,
c
FROM #table1
WHERE RowNumber BETWEEN @RowStart AND @RowEnd
It works the way I want it to, it just doesn't perform that great. I would like to get it working with a CTE and then compare execution plans and IO and go with the one that performs better. If I could get it working as a CTE or some other alternative solution that I'm missing that would be great.
Thanks for any and all help.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

Lowell SSC Guru Points: 323522 More actions · Answer 1

i just sat in on a SQLSaturday #79 in Davie(Fort Lauderdale) presentation on how to improve performance in exactly this situation;

the Presenter was Dimitri, and he had a great example how he added indexes and CTE's untill the join of 20 primariy keys from the row number CTE was joined ont eh bigger(millionrowtable?) in order to get the best performance.

i looked on the sql saturday 79 page, but didn't find links to his presentation, nore in the schedule...i think his presentation was a late add on.

http://www.sqlsaturday.com/79/eventhome.aspx

here's an example using sys.objects, which i hope is understandable. the idea is cross join the total, and use a CTE to limit the rows to just stuff that exists in an index, before finally joining to get the rest of the data, so you get , say just 100 rows, isntead of a zillion rows that get limited to 100

--declare our variables

DECLARE @RowStart INT

DECLARE @RowEnd INT

SET @RowStart = 1

SET @RowEnd = 100

--get the total count

WITH TotalCount AS

(

SELECT COUNT(*) AS TotalCount FROM sys.objects objz

),

--for performance reasons, get the PK and the x number of rows we want to get.

--the ORDER BY is important: if we are Ordering by a differnet column, make sure there is an

--index that has the PK and the Ordering column

AllRows AS

(

SELECT

objz.OBJECT_ID,

ROW_NUMBER() OVER (ORDER BY objz.name ) AS RW

FROM sys.objects objz

)

--i think Dimitri had an additional CTE that just selected the rows in the row number between start and end...here i just grab them all

SELECT

AllRows.*,

objz.*,

TotalCount.*

FROM AllRows

INNER JOIN sys.objects objz

CROSS JOIN TotalCount

ON Allrows.object_id = objz.object_id

WHERE AllRows.RW BETWEEN @RowStart AND @RowEnd

the key is looking at the execution plan, so you can see the final join is just 100 rows, isntead of a lot more

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

brendan woulfe Hall of Fame Points: 3888 More actions · Answer 2

Interesting solution. I'll have to try and wrap my head around that one. Not seeing how I would return the total count as an output parameter though? Wish I could see that presentation too, maybe they'll put it up later.

Cadavre SSC-Forever Points: 41690 More actions · Answer 3

Is there any problem with just using @@ROWCOUNT ?

DECLARE @RowStart INT

DECLARE @RowEnd INT

SET @RowStart = 1

SET @RowEnd = 100

;WITH CTE AS

(

SELECT *, ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RowNumber

FROM sys.objects

)

SELECT * FROM CTE

WHERE RowNumber BETWEEN @RowStart AND @RowEnd

SELECT @@ROWCOUNT

--edit--

Obviously, use Lowell's solution for the rest of the query - if it adds any performance gains. I haven't seen it before which is why I haven't used it in the example above - want to get my head around it first!.

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 4

Cadavre (8/16/2011)
Is there any problem with just using @@ROWCOUNT ?
DECLARE @RowStart INT
DECLARE @RowEnd INT
SET @RowStart = 1
SET @RowEnd = 100
;WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RowNumber
FROM sys.objects
)
SELECT * FROM CTE
WHERE RowNumber BETWEEN @RowStart AND @RowEnd
SELECT @@ROWCOUNT
--edit--
Obviously, use Lowell's solution for the rest of the query - if it adds any performance gains. I haven't seen it before which is why I haven't used it in the example above - want to get my head around it first!.

Just a small problem with using @@rowcount here...

It will not return what was required (total number of records before pagiong) but the number of records in the page.

In your example, if database have more than 50 user defined objects, instead of SELECT @@ROWCOUNT you could just do SELECT 100 :-D.

I would bodify offered solution to count total number of rows out of CTE, just to safe some code lines

--declare our variables

DECLARE @RowStart INT

DECLARE @RowEnd INT

SET @RowStart = 1

SET @RowEnd = 100

--get the records

;WITH Recs AS

(

SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) RN -- or whatever requried

FROM sys.objects

)

--Get the total OUT of full resultset

,Cnt AS ( SELECT COUNT(*) TtlCnt FROM Recs )

SELECT R.*, C.TtlCnt

FROM Recs R CROSS JOIN Cnt C

WHERE R.RN BETWEEN @RowStart AND @RowEnd

Also, I'm not sure if selecting just ID in CTE and joining it back would anyhow increase performance... Actually it's opposit! You will end up with more reads, as CTE is just a stylish wrapper for subquery. So, if you select ID's then join back to sys.objects SQL Server will perform much more reads then in my example. (~600 vs ~200).

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

brendan woulfe Hall of Fame Points: 3888 More actions · Answer 5

Eugene has it right, it returns the number of records that were returned after paging. I need to know how many would have been returned had there been no paging.

I know you were just joking but I hardcoded those values (1 and 100), the calling application will send me a page number and a page size and then I have to calculate how many results to return. It's for a web search app so the user knows they are seeing results 1-100 of 1000, or 201-300 of 1000, or 1-5 of 10 etc. It really could be anything...just a background on the query in question.

brendan woulfe Hall of Fame Points: 3888 More actions · Answer 6

Eugene, so the count is returned as a column in the resultset? I already tried something similar and it wasn't an accepted solution...:(

I never understood why not, but I'm not a web developer, so they may have their reasons.

Cadavre SSC-Forever Points: 41690 More actions · Answer 7

Eugene Elutin (8/16/2011)
Just a small problem with using @@rowcount here...
It will not return what was required (total number of records before pagiong) but the number of records in the page.

Ah, misread the OP.

I guess to fit the multiple result-sets you'd need a temp table then.

bwoulfe (8/16/2011)
Eugene, so the count is returned as a column in the resultset? I already tried something similar and it wasn't an accepted solution...:(
I never understood why not, but I'm not a web developer, so they may have their reasons.

I had a .Net developer say the same to me awhile ago (although the requirements were slightly different - instead he wanted the whole result-set returned but wanted to know the size of the result-set). After we discussed it for awhile, we added a WCF component as a middle-man that obtained the entire result-set including a separate column that contained the number of records returned. The WCF component then passed this data to the front-end application before parcelling out the result-set in the way it was required.

Eugene Elutin (8/16/2011)
Also, I'm not sure if selecting just ID in CTE and joining it back would anyhow increase performance... Actually it's opposit! You will end up with more reads, as CTE is just a stylish wrapper for subquery. So, if you select ID's then join back to sys.objects SQL Server will perform much more reads then in my example. (~600 vs ~200).

I thought that would be the case, but haven't had time to test Lowell's code. Hoping that the presentation gets posted online somewhere, would be interested to see how it worked!

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 8

How you want your count to be returned then?

Returning the count in the column would be the best option, client app can read this value from first row and then ignore this column completely.

You can return it as output parameter of stored proc or in a separate resultset. But then I whould advice to use temp table (example attached).

What you cannot do is: to return resultset of 100 rows but tell the client app that its recordscount is different

--declare our variables

DECLARE @RowStart INT

DECLARE @RowEnd INT

SET @RowStart = 1

SET @RowEnd = 100

--get the records

SELECT [object_id], ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) RN -- or whatever requried

INTO #Recs

FROM sys.objects

SELECT R.*

FROM Recs R

WHERE R.RN BETWEEN @RowStart AND @RowEnd

-- Option 1. return total count as stored proc return parameter

-- (you can do it via output parameter as well)

DECLARE @R INT

SELECT @R = COUNT(*) FROM #Recs

RETURN @R

-- Option 2. return total count in a separte resultset

SELECT COUNT(*) AS TotalRecCount FROM #Recs

And Yes, using temp table in this case is fine from performance perspective...

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

brendan woulfe Hall of Fame Points: 3888 More actions · Answer 9

They want it returned with an OUTPUT parameter from the stored procedure. It's setup with temp table right now and the performance is just ok. I may just need to stick with a temp table but rewrite it. The stored procedure is over 1000 lines of code so I've been dreading doing that.

Thanks for the responses everyone, I think it's back to the drawing board though. I'm going to try the count returned in a column solution with them again, see if that might work this time....:-P

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 10

bwoulfe (8/16/2011)
...It's setup with temp table right now and the performance is just ok. I may just need to stick with a temp table but rewrite it...

Using temp tables is fine and may be even more beneficial then CTE sometimes.

Just few tips for rewritting:

1. Use SELECT ... INTO to create and populate temp table

2. Consider creating appropriate indexes on temp tables if they used multiple times in joins.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

brendan woulfe Hall of Fame Points: 3888 More actions · Answer 11

Eugene Elutin (8/16/2011)
bwoulfe (8/16/2011)
...It's setup with temp table right now and the performance is just ok. I may just need to stick with a temp table but rewrite it...
Using temp tables is fine and may be even more beneficial then CTE sometimes.
Just few tips for rewritting:
1. Use SELECT ... INTO to create and populate temp table
2. Consider creating appropriate indexes on temp tables if they used multiple times in joins.

I'm using INSERT INTO, just so I can see my column names and reference them later if a problem occurs (this is still in dev). Any reason SELECT...INTO is preferred?

I do all the inserts, then add my indexes, then return the results. There are about 4 or 5 inserts into the temp table though, maybe reordering those steps might help?

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 12

SELECT Col1 AS Column1

,Col2 AS Column2

.....

INTO #t

You can see that you can name your columns quite easely. You can also define required datatype by casting the value to it (eg. SELECT CAST(SomeInt AS VARCHAR(11)) AS ...)

Benefits in performance...

You can read quite a lot about it

Just few links you may find interesting:

http://stackoverflow.com/questions/1302670/sql-server-select-into-and-blocking-with-temp-tables

http://msdn.microsoft.com/en-us/library/dd425070(v=sql.100).aspx

Basically it's one of the "bulk load" methods within SQL Server so gives you the fasterst possible insert speed in T-SQL.

If data you inserting is naturally distinct you can insert all of your data at once using UNION ALL statement, something like:

SELECT q1.col1, q1.Col2, ...

INTO #t

FROM q1

UNION ALL

SELECT q2.col1, q2.Col2, ...

FROM q2

UNION ALL

SELECT q3.col1, q3.Col2, ...

FROM q3

....

If needs to be deduplicated any way you may test if using UNION alone will be good enough.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 13

Lowell (8/15/2011)
i just sat in on a SQLSaturday #79 in Davie(Fort Lauderdale) presentation on how to improve performance in exactly this situation;
the Presenter was Dimitri, and he had a great example how he added indexes and CTE's untill the join of 20 primariy keys from the row number CTE was joined ont eh bigger(millionrowtable?) in order to get the best performance.

Was it the key seek method, explained elegantly by Paul White here[/url]?

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Lowell SSC Guru Points: 323522 More actions · Answer 14

ChrisM@Work (8/16/2011)
Was it the key seek method, explained elegantly by Paul White here[/url]?

that post from Paul exactly encapsulates that portion of his presentation Chris, yes.

» Dmitri Korotkevitch's web site, now that i grabbed my notes, is http://aboutsqlserver.com but he has not posted his presentation materials from it yet.

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Returning row count as output in CTE

Cookies on SQLServerCentral