Paging data with CTE

  • Hi,

    I use CTE with ROW_NUMBER() OVER (...) to page data from tables...

    I was convinced that the expression within the CTE should be as "small" as possible, by using only the fields needed for sorting and filtering.

    In this example the Employees table has 10.000 rows and the CTE query returns 71 and I want from 10 to 19:

    ;WITH emps AS

    (

    SELECT empid, empname, ROW_NUMBER() OVER (ORDER BY empname) row FROM Employees WHERE empname like 'Emp%' and salary < 100000

    )

    SELECT * FROM emps WHERE row BETWEEN 10 AND 19

    PRINT '----------- NEXT -----------'

    ;WITH emps AS

    (

    SELECT empid, ROW_NUMBER() OVER (ORDER BY empname) row FROM Employees WHERE empname like 'Emp%' and salary < 10000

    )

    SELECT r.empid, r.empname FROM emps t INNER JOIN Employees r ON t.empid = r.empid WHERE row BETWEEN 10 AND 19

    There's an index on empname and salary and the PK over empid.

    On the 1st query I get all the data I want to output right from the CTE and on the 2nd I only get the data necessary to filter from the CTE and then join with the tables I want the data from...

    I thought that the 1st query was much "heavier" since I get 985 rows with all the data and the I only want 10, and the 2nd get 985 rows only with the Id...

    But not...

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 18 ms.

    (10 row(s) affected)

    Table 'Employees'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 117 ms.

    ----------- NEXT -----------

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    (10 row(s) affected)

    Table 'Employees'. Scan count 1, logical reads 39, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 187 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    The 2nd query has more reads and takes longer..

    The execution plan says the 1st takes 10% and the 2nd 90%.... Why?!

    If I had more tables to get data from would the results be as I expected?

    Thanks,

    Pedro

    PS: after execution a few times the 2nd query takes less time, 3ms less than the 1st one.



    If you need to work better, try working less...

  • Post the actual execution plans as .sqlplan attachments. It's difficult to determine anything useful from these figures.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (10/4/2012)


    Post the actual execution plans as .sqlplan attachments. It's difficult to determine anything useful from these figures.

    Thanks... here it is.

    Pedro



    If you need to work better, try working less...

  • PiMané (10/4/2012)


    ChrisM@Work (10/4/2012)


    Post the actual execution plans as .sqlplan attachments. It's difficult to determine anything useful from these figures.

    Thanks... here it is.

    Pedro

    The plans are almost the same except for the clustered index seek due to the join with the CTE by empid...

    But my main question is would these results still be true if there was a join with other tables or had a blob as output?!

    I added a IMAGE field and put it as output on both queries and the 1st took 48% and the 2nd 52%, but while the 2nd kept the index seeks the 1st changed to clustered scan (since the IMAGE field wasn't "COVERED").

    In terms of time the 2nd got a little faster than the 1st.. but every time I run the queries it changes.. from -20ms to +30ms.. not a steady and reliable value...

    I could create an index to cover IMAGE field but if another field was added we'd be changing indexes all day....

    Thanks,

    Pedro



    If you need to work better, try working less...

  • PiMané (10/4/2012)


    PiMané (10/4/2012)


    ChrisM@Work (10/4/2012)


    Post the actual execution plans as .sqlplan attachments. It's difficult to determine anything useful from these figures.

    Thanks... here it is.

    Pedro

    The plans are almost the same except for the clustered index seek due to the join with the CTE by empid...

    But my main question is would these results still be true if there was a join with other tables or had a blob as output?!

    I added a IMAGE field and put it as output on both queries and the 1st took 48% and the 2nd 52%, but while the 2nd kept the index seeks the 1st changed to clustered scan (since the IMAGE field wasn't "COVERED").

    In terms of time the 2nd got a little faster than the 1st.. but every time I run the queries it changes.. from -20ms to +30ms.. not a steady and reliable value...

    I could create an index to cover IMAGE field but if another field was added we'd be changing indexes all day....

    Thanks,

    Pedro

    Why, in the second query, do you join the employees table to the CTE? I can't see any reason for it - unless there are dupes on EmpID.

    Why are you surprised that the query is more expensive?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (10/4/2012)


    Why, in the second query, do you join the employees table to the CTE? I can't see any reason for it - unless there are dupes on EmpID.

    Why are you surprised that the query is more expensive?

    Because I just want to get the keys that match the conditions so I can get the other data...

    If instead of 1 table I had 20 fields from 5 tables and only one to get the keys from, for example employees was the one with keys but add addresses, orders (count of orders the employee inserted), .... probably the 2nd way would be faster... right?!

    Or SQL is "smart" enough to get only the keys from the paged records and only then get the other data (1st execute the rows filter and only after the joins...)?

    Thanks,

    Pedro



    If you need to work better, try working less...

  • PiMané (10/4/2012)


    ChrisM@Work (10/4/2012)


    Why, in the second query, do you join the employees table to the CTE? I can't see any reason for it - unless there are dupes on EmpID.

    Why are you surprised that the query is more expensive?

    Because I just want to get the keys that match the conditions so I can get the other data...

    If instead of 1 table I had 20 fields from 5 tables and only one to get the keys from, for example employees was the one with keys but add addresses, orders (count of orders the employee inserted), .... probably the 2nd way would be faster... right?!

    Or SQL is "smart" enough to get only the keys from the paged records and only then get the other data (1st execute the rows filter and only after the joins...)?

    Thanks,

    Pedro

    SSC has just the article for you, right here[/url]. No need to redesign the wheel just yet 😉

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (10/4/2012)


    SSC has just the article for you, right here[/url]. No need to redesign the wheel just yet 😉

    My initial thoughts are right then 🙂

    Thx,

    Pedro



    If you need to work better, try working less...

  • PiMané (10/4/2012)


    ChrisM@Work (10/4/2012)


    SSC has just the article for you, right here[/url]. No need to redesign the wheel just yet 😉

    My initial thoughts are right then 🙂

    Thx,

    Pedro

    Reckon so 😉

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (10/4/2012)


    PiMané (10/4/2012)


    ChrisM@Work (10/4/2012)


    SSC has just the article for you, right here[/url]. No need to redesign the wheel just yet 😉

    My initial thoughts are right then 🙂

    Thx,

    Pedro

    Reckon so 😉

    Thanks for the enlightenment and help 🙂

    Pedro



    If you need to work better, try working less...

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply