Pros and cons of six SQL table tools

  • Comments posted to this topic are about the item Pros and cons of six SQL table tools

    Andy is a director of Wise Owl[/url], a UK company providing training courses (and occasional consultancy) in SQL, Reporting Services, Integration Services and Analysis Services, as well as in many other Microsoft software applications. You can see more about Wise Owl's SQL Server training courses here[/url].

  • Good article on showing that there's more than one way to accomplish the same thing.

    I do have a couple of comments though...

    It's a shame that you didn't include any actual performance data especially since performance is one of the biggest "Pros and Cons" that folks working with "big data" have a concern about. Simply mentioning any hint of performance or a performance problem as "possibly" will give some people an incorrect impression and they may avoid a high speed method because someone said "possibly not as fast...". As we both know, "It Depends" and Table Variables and Temp Tables take turns blowing the doors off of each other depending on what is being done. I'd stick to facts meaning that if you don't have code to prove something one way or the other in the article, I'd leave the conjecture out of the article.

    The other thing that I want to cite is based on another bit of innuendo included in the article. You say that table variables are "memory only" and not only is that the furthest thing from the truth, but it also implies that Temp Tables are "disk only". In fact, while it's true that both Temp Tables and Table Variables are spawned in TempDB, it's also true that they both start out in memory and only use the disk when they won't fit in memory for one reason or another.

    Other than that, this is a good article that shows different methods of doing the same thing. The note on the use of iTVFs as a good replacement for Views and Stored Procedures is a good one that more people should take advantage of. I'll add that iTVFs (Inline Table Valued Functions) make for a great "parameterized View".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I enjoyed the article and think it's a good overall non-biases intro. There are a couple of points I feel are worth mentioning. The in-line TVF can be consumed and optimized by the query processor while the multi-statement variety cannot. The other point is an observation I've made about CTE vs derived tables. If the CTE is simple (nothing more than a simple select) then you need to weigh the pro/con for the entire plan. By breaking it into essentially two queries, you end up with two plans that may be better when put together. I've run across situations where a CTE was defined and populated with the "driver" records for the subsequent DML statement and when asked why they chose this approach it's frequently that they wanted to try out CTEs. A little over half of the time I get better performance by using a derived table. I believe this is because the query optimizer gets the entire workload and can choose a better plan than if the two statements are run independently. As always, your mileage will vary, but if you're trying to get the best performance you should check out all valid approaches.

    --Paul Hunter

  • I enjoyed the look at the various method of data access you compared. Thanks for taking the time to put this together.

    I do want to comment on your assessment that derived tables and common table expressions are "essentially the same thing." While it may appear so, these two solutions are very different in functionality. A derived table can be used once only in an outer query, while a CTE can be used multiple times and can be used recursively, providing significantly more functionality. For example, the query could return not only the names but a subtotal count of the number of females by role:

    ;WITHFemales

    AS(

    SELECTDirectorName AS PersonName,

    'Director' AS Job,

    DirectorDob AS Dob

    FROMtblDirector

    WHEREDirectorGender = 'Female'

    UNION

    SELECTActorName AS PersonName,

    'Actor' AS Job,

    ActorDob AS Dob

    FROMtblActor

    WHEREActorGender = 'Female'

    )

    SELECTJob, PersonName

    FROM(

    SELECTJob, 1 AS SortSeq, PersonName

    FROMFemales

    UNION

    SELECT'Total for ' + Job AS Job, 2 AS SortSeq, CAST(COUNT(PersonName) AS VARCHAR(10)) AS PersonName

    FROMFemales

    GROUP BY 'Total for ' + Job

    ) x

    ORDER BY Job, SortSeq, PersonName;

    (I'm not at a SQL Server window at the moment, so I have not tested the code above.)

    Please notice that the "Females" CTE has been used twice in the outer query, once to list the details and once to provide summary records by job description, something that (to my knowledge...) is not possible with a derived table.

  • fahey.jonathan (12/26/2011)


    Please notice that the "Females" CTE has been used twice in the outer query, once to list the details and once to provide summary records by job description, something that (to my knowledge...) is not possible with a derived table.

    That statement is absolutely correct. However, just like "calling" a view twice in the same query, the CTE that is called twice will also execute twice. You can see that in the execution plan. That might be something to consider in planning for performance. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/26/2011)


    fahey.jonathan (12/26/2011)


    Please notice that the "Females" CTE has been used twice in the outer query, once to list the details and once to provide summary records by job description, something that (to my knowledge...) is not possible with a derived table.

    That statement is absolutely correct. However, just like "calling" a view twice in the same query, the CTE that is called twice will also execute twice. You can see that in the execution plan. That might be something to consider in planning for performance. 🙂

    Good thing to keep in mind! Thanks.

  • I was surprised and disappointed to see that the inability to use SELECT... INTO was a disadvantage.

    Anything which takes that option away from developers is a good thing IMHO!

    I have a rule for Table Variables which I am pretty strict about. If the developer can't tell me exactly how many records (to within 10) and that number is less than 1000 (reasonably small) records, the it should be a temp table.

    You also missed one big advantage of a temp table. You can index it. You need to do this a bit carefully, because index names are unique, so if more than one process is going to run generating this temp table, then you need to add a "uniqueifier" to the index name (such as @@SPID).

  • Toby Harman (1/2/2012)


    I was surprised and disappointed to see that the inability to use SELECT... INTO was a disadvantage.

    Anything which takes that option away from developers is a good thing IMHO!

    I have a rule for Table Variables which I am pretty strict about. If the developer can't tell me exactly how many records (to within 10) and that number is less than 1000 (reasonably small) records, the it should be a temp table.

    You also missed one big advantage of a temp table. You can index it. You need to do this a bit carefully, because index names are unique, so if more than one process is going to run generating this temp table, then you need to add a "uniqueifier" to the index name (such as @@SPID).

    Not quite. Index names do not need to be unique. Constraint names do. If you let the system build the constraint names on temp tables, you'll never have a conflict.

    Also, through the use of constraints built at creation time, you can, in fact, have indexes on table variables.

    Not that it'll matter to anyone, but the only places where are I use table variables is when I need for something to be "Rollback Proof" or I need a table structure in a function.

    So far as SELECT/INTO goes, I've found it to be quite handy both in the area of performance and keeping reasonable log sizes especially in TempDB. I'm not sure why so many people are against the use of such a thing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The fact that it places a schema lock on the tempdb for the duration of the query is my issue. If the query is fast enough, this may not be an issue, but when someone starts moving 100,000 records into a temp-table using this on a system with thousands of users, I suspect the issue may become more relevant!

    I suspect I'm overly pessimistic about this, but that reflects my life experience! All too often I've inherited someone else's coding techniques!

    I stand corrected about the constraint names! The first index I add is generally clustered (and thus a constraint!), and these do need the "uniqueifier" to be multi-user.

  • I think you meant "PK", not clustered index which can, of course, be different.

    Absolutely understood on the concerns over SELECT/INTO. I've build some huge tables with it over many minutes. So long as it's on the same server, I've not run into any problems even on busy servers. I definitely wouldn't use it across linked servers because it will sometimes "freeze" the source DB (and it was truly embarassing when that happened to me).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply