A Case FOR Cursors...

  • okay cool, Ill have to change my template for stored procs around then to use that format

  • cursors are there to be used for admin scripts to be executed manually

    I like this caveat, especially as it now excuses my one use I have for looping in TSQL: populating my date dimension. (I also used it for my time dimension, but once that is filled, it never needs to be filled again.)

  • RonKyle (6/5/2015)


    cursors are there to be used for admin scripts to be executed manually

    I like this caveat, especially as it now excuses my one use I have for looping in TSQL: populating my date dimension. (I also used it for my time dimension, but once that is filled, it never needs to be filled again.)

    Populating a date dimension can be done using a tally table. No loop necessary.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B (6/5/2015)


    RonKyle (6/5/2015)


    cursors are there to be used for admin scripts to be executed manually

    I like this caveat, especially as it now excuses my one use I have for looping in TSQL: populating my date dimension. (I also used it for my time dimension, but once that is filled, it never needs to be filled again.)

    Populating a date dimension can be done using a tally table. No loop necessary.

    I think this raises an interesting point. If I'm writing a quick script for one-time use, and I'm in a hurry, I forgive myself a lot of inefficiencies. On the other hand... if you have the time, this is your chance to practice doing things the right way. Using certain patterns a few times will make it easier to repeat them without turning to documentation or the internet when you are in a hurry.

    So, I wouldn't necessarily criticize someone or say "don't ever use a loop to populate a time dimension table," but I would say "hey, give it a try using a different method if you have a chance, it'll make you a better T-SQL coder"!

    I'm really appreciating this discussion because, although I'm more willing to defend cursor use than others, it's a reminder that I still have patterns I need to make second nature.

  • Brian J. Parker (6/5/2015)


    Alan.B (6/5/2015)


    RonKyle (6/5/2015)


    cursors are there to be used for admin scripts to be executed manually

    I like this caveat, especially as it now excuses my one use I have for looping in TSQL: populating my date dimension. (I also used it for my time dimension, but once that is filled, it never needs to be filled again.)

    Populating a date dimension can be done using a tally table. No loop necessary.

    I think this raises an interesting point. If I'm writing a quick script for one-time use, and I'm in a hurry, I forgive myself a lot of inefficiencies. On the other hand... if you have the time, this is your chance to practice doing things the right way. Using certain patterns a few times will make it easier to repeat them without turning to documentation or the internet when you are in a hurry.

    So, I wouldn't necessarily criticize someone or say "don't ever use a loop to populate a time dimension table," but I would say "hey, give it a try using a different method if you have a chance, it'll make you a better T-SQL coder"!

    I'm really appreciating this discussion because, although I'm more willing to defend cursor use than others, it's a reminder that I still have patterns I need to make second nature.

    Yeah doing something like using a tally table to populate a date dimension is something that seems a bit obtuse the first time. The first time you do something like with a tally table though and the concept sinks in you just automatically go to the tally table next time instead of a loop. In this example the benefits are two fold. The code executes faster but it is also a LOT simpler to write than a loop. The third possible benefit is that it starts moving the developers thinking into sets. The challenge is in thinking about what you want to do to a column instead of what you want to do to each row. When you automatically think about the columns you are well on your way. 😛

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (6/5/2015)


    Brian J. Parker (6/5/2015)


    Alan.B (6/5/2015)


    RonKyle (6/5/2015)


    cursors are there to be used for admin scripts to be executed manually

    I like this caveat, especially as it now excuses my one use I have for looping in TSQL: populating my date dimension. (I also used it for my time dimension, but once that is filled, it never needs to be filled again.)

    Populating a date dimension can be done using a tally table. No loop necessary.

    I think this raises an interesting point. If I'm writing a quick script for one-time use, and I'm in a hurry, I forgive myself a lot of inefficiencies. On the other hand... if you have the time, this is your chance to practice doing things the right way. Using certain patterns a few times will make it easier to repeat them without turning to documentation or the internet when you are in a hurry.

    So, I wouldn't necessarily criticize someone or say "don't ever use a loop to populate a time dimension table," but I would say "hey, give it a try using a different method if you have a chance, it'll make you a better T-SQL coder"!

    I'm really appreciating this discussion because, although I'm more willing to defend cursor use than others, it's a reminder that I still have patterns I need to make second nature.

    Yeah doing something like using a tally table to populate a date dimension is something that seems a bit obtuse the first time. The first time you do something like with a tally table though and the concept sinks in you just automatically go to the tally table next time instead of a loop. In this example the benefits are two fold. The code executes faster but it is also a LOT simpler to write than a loop. The third possible benefit is that it starts moving the developers thinking into sets. The challenge is in thinking about what you want to do to a column instead of what you want to do to each row. When you automatically think about the columns you are well on your way. 😛

    Populating a date table with a Tally table is exactly the way I leaned how to use them and became comfortable with them. They can be incredibly useful, especially for text parsing.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Populating a date dimension can be done using a tally table. No loop necessary.

    The code executes faster but it is also a LOT simpler to write than a loop.

    The loop is a very straightforward straightfoward to write and only takes several seconds to enter years worth of dates. I actually populate most of the columns using a set based TSQL, but can't avoid it for entering the actual date itself for the set-based queries to have something to work with. My one comment to your comment is that you don't actually explain what a tally table is or how it solves the issue. As I'm always interested in learning new techniques, please feel free to explain. Thanks!

  • RonKyle (6/5/2015)


    Populating a date dimension can be done using a tally table. No loop necessary.

    The code executes faster but it is also a LOT simpler to write than a loop.

    The loop is a very straightforward straightfoward to write and only takes several seconds to enter years worth of dates. I actually populate most of the columns using a set based TSQL, but can't avoid it for entering the actual date itself for the set-based queries to have something to work with. My one comment to your comment is that you don't actually explain what a tally table is or how it solves the issue. As I'm always interested in learning new techniques, please feel free to explain. Thanks!

    I would be happy to explain it but our good friend Jeff Moden already has a spectacular article covering this very topic. http://www.sqlservercentral.com/articles/T-SQL/62867/[/url]

    The tally table has been dubbed "the swiss army knife of t-sql" (I am kicking myself I can't remember who but I am thinking Gianluca). Once you get the concept the places where you can use it are amazing.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • RonKyle (6/5/2015)


    Populating a date dimension can be done using a tally table. No loop necessary.

    The code executes faster but it is also a LOT simpler to write than a loop.

    The loop is a very straightforward straightfoward to write and only takes several seconds to enter years worth of dates. I actually populate most of the columns using a set based TSQL, but can't avoid it for entering the actual date itself for the set-based queries to have something to work with. My one comment to your comment is that you don't actually explain what a tally table is or how it solves the issue. As I'm always interested in learning new techniques, please feel free to explain. Thanks!

    A Tally table can frequently do the same thing in under a second.

    Here's an article I wrote about fiscal date population. At the beginning of the article is a link to an article by Jeff Moden that explains everything wonderfully.

    My Article

    Jeff's fantastic article

    This is one I wrote on using a tally table to find hidden characters

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • None of the links worked for me. What am I not doing correctly?

  • Not sure why the previous links were funky...something weird is going on....

    --EDIT--

    The links had a couple invalid characters.

    Sioban's Article[/url]

    Jeff's fantastic article[/url]

    This is one Sioban wrote on using a tally table to find hidden characters[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (6/5/2015)


    Not sure why the previous links were funky...something weird is going on....

    --EDIT--

    The links had a couple invalid characters.

    Sioban's Article[/url]

    Jeff's fantastic article[/url]

    This is one Sioban wrote on using a tally table to find hidden characters[/url]

    Huh, thanks for catching that.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • No argument with the statement that cursors are acceptable where row by row processing can't be avoided. The reason relative newcomers to SQL are warned away from them is that they are often used as a crutch by developers who are used to procedural (row by row) thinking. This enables them to right functional, but poorly performing code, rather than understanding set-based processing.

    I remember one thread that persisted for over a week, with the author arguing that certain things could not possibly be done in a set based manner, until he was shown working solutions that did the "impossible", from people who actually understood the concept.

    Row by row processing has its place in SQL. But that niche is much smaller than many people realize.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • RonKyle (6/5/2015)


    Populating a date dimension can be done using a tally table. No loop necessary.

    The code executes faster but it is also a LOT simpler to write than a loop.

    The loop is a very straightforward straightfoward to write and only takes several seconds to enter years worth of dates. I actually populate most of the columns using a set based TSQL, but can't avoid it for entering the actual date itself for the set-based queries to have something to work with. My one comment to your comment is that you don't actually explain what a tally table is or how it solves the issue. As I'm always interested in learning new techniques, please feel free to explain. Thanks!

    About 3 years ago I had never heard of a "numbers" or "tally table" and remember Jeff Moden saying on one of these forums, "learn how to use a tally table, it will change your life". That intrigued me. I read his article and a few years later can say that it did change my life. Not a month goes by since that I have not used a tally table to make a query run several times faster (100's or 1000's of times faster in some cases). I have used other set-based methods I have learned since to accomplish the same but for now let's focus on using a Tally table to populate a date dimension table.

    BTW, I don't know about simpler (a loop is as straight-forward as it gets) and whatever method you use it only takes a few seconds max to populate a typical dimdate table because we're not talking about millions or even 100's of thousands of rows. That said, this is a good example of how learning to use a tally table will make you a better developer.

    Below is code I threw together real quick. We're just going to focus on one column and we'll do hours (not common but will suffice for this example). We're going to generate 999,288 rows of data by getting the date, by hour, beginning in 1982 through 2020. The first way using a tally table, the second using a loop.

    SET NOCOUNT ON;

    IF OBJECT_ID('tempdb..#dimdate') IS NOT NULL DROP TABLE #dimdate;

    CREATE TABLE #dimdate(DateTxt datetime NOT NULL);

    GO

    PRINT '-======= using set-based method: ========'

    DECLARE @startTime datetime = getdate();

    DECLARE

    @StartDate datetime = '1/1/1982',

    @EndDate datetime = '1/1/2020';

    WITH

    E1(N) AS (SELECT 1 FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t(N)),

    E3(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c),

    Nums(N) AS (SELECT 0 UNION ALL SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) FROM E3 a, E3 b)

    INSERT #dimdate

    SELECT TOP(DATEDIFF(HOUR,@StartDate,@EndDate))

    DateTxt = DATEADD(HOUR,N,@StartDate)

    FROM Nums;

    PRINT DATEDIFF(MS,@StartTime,getdate());

    GO 3

    IF OBJECT_ID('tempdb..#dimdate2') IS NOT NULL DROP TABLE #dimdate2;

    CREATE TABLE #dimdate2(DateTxt datetime NOT NULL);

    GO

    PRINT '-======= using a loop: ========'

    DECLARE @startTime datetime = getdate();

    DECLARE

    @StartDate datetime = '1/1/1982',

    @EndDate datetime = '1/1/2020',

    @i int = 1;

    WHILE @i <= DATEDIFF(HOUR,@StartDate, @EndDate)

    BEGIN

    SET @startDate = DATEADD(HOUR,@i,@StartDate);

    INSERT #dimdate2 VALUES(@StartDate);

    END

    PRINT DATEDIFF(MS,@StartTime,getdate());

    GO 3

    The test runs three times for each. The results:

    Beginning execution loop

    -======= using set-based method: ========

    350

    -======= using set-based method: ========

    303

    -======= using set-based method: ========

    293

    Batch execution completed 3 times.

    Beginning execution loop

    -======= using a loop: ========

    2413

    -======= using a loop: ========

    2426

    -======= using a loop: ========

    2420

    Batch execution completed 3 times.

    As you can see the tally table method is ~8.5 times faster.

    This is simple example for one column. You can populate all columns of using a combination of DATEADD, DATEPART, DATENAME, YEAR, MONTH, WEEKDAY, etc... and perhaps some string functions and/or simple math depending on your needs. For a dimdate table, who cares. But when you have a function or sproc that runs frequently against millions of rows the results are far-more staggering.

    As Sean said, this article "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] does a fantastic job explaining what a numbers table or Tally table is. It's a must read.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Why do you need the hour? Is that necessary to populate the date when it's just the date? I'm intrigued by this method, but not sure it addresses just adding a date only. (I wouldn't ever mix my date and time dimensions for lots of reasons). Unfornately I can't test this until I get home, but am trying to understand it. I agree that first the most part the remaining fields can be populated using date time functions. There are some exceptions of course, such as if the day is a holiday.

Viewing 15 posts - 106 through 120 (of 215 total)

You must be logged in to reply to this topic. Login to reply