Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    I also consider that to be mostly a "marketing for Microsoft" site.  Consider the number of hits the bloody community put on the "dark theme" request in SSMS and how few hits there are on the request to fix the broken STRING_SPLIT() function or the 12 years it's been for a high performance machine language level fnTally() function to be built into T-SQL like it has been in much cheaper versions of RDBMS software almost since their inception 1 to 2 decades ago.

    And then there are "joys" like the bloody performance issues built into the FORMAT() function and the fact that the newer temporal datatypes like DATE, TIME, DATETIME2 are no longer ANSI compliant when it comes to date math.

    None of that will ever be fixed because it's obviously much more important to come of with things like the "dark theme" than it is to fix partially broken or limited usage stuff.  Heh... and when they do, it's not actually a fix but some bloody workaround like DATEDIFF_BIG or the current rendition of that damned PIVOT operator (which is MUCH better in MS Access!).

    It's OK, Jeff! Don't hold back! Tell Microsoft how you really feel! 😀

    I also would love to see "a high performance machine language level fnTally() function to be built into T-SQL like it has been in much cheaper versions of RDBMS software almost since their inception 1 to 2 decades ago." Postgres has a pretty nifty (and fast!!) generate_series() function for multiple data types.

  • The Postgres function is the one that I remember the best.  I've not tested it for performance because I've never had to work on a Postgres DB.  Oddly enough, I have been able to help some folks out in Postgres just by reading the documentation on the language (there are quite a few differences in things like the temporal functions).  I've never actually written even a single line of code in it myself. 😀

    Heh... as for holding back... I actually AM holding back!  😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I considered submitting a request asking for a way to suppress the dark mode request.

    I actually have to use DATEDIFF_BIG.  The fact they couldn't be bothered to implement the corresponding DATEADD_BIG is frustrating.  I wonder why they couldn't have just implemented implicit type conversion to bigint in the existing functions.

     

  • You can ab

    ratbak wrote:

    I considered submitting a request asking for a way to suppress the dark mode request.

    I actually have to use DATEDIFF_BIG.  The fact they couldn't be bothered to implement the corresponding DATEADD_BIG is frustrating.  I wonder why they couldn't have just implemented implicit type conversion to bigint in the existing functions.

    You can easily avoid the need for DATEDIFF_BIG.  The following article shows how to do that, which also demonstrates the inadequacies of the newer temporal datatypes.  MS did us no favors there.

    https://www.sqlservercentral.com/articles/calculating-duration-using-datetime-start-and-end-dates-sql-spackle-2

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ratbak wrote:

    I considered submitting a request asking for a way to suppress the dark mode request.

    I actually have to use DATEDIFF_BIG.  The fact they couldn't be bothered to implement the corresponding DATEADD_BIG is frustrating.  I wonder why they couldn't have just implemented implicit type conversion to bigint in the existing functions.

    Looking back at this, I finally found a decent use for DATEDIFF_BIG... it's to create UNIX TimeStamps of both the "seconds since 1970" and "milliseconds since 1970" flavor (the later of which is necessarily a BIGINT, which also fixes the "2038" issue.

    As you say, it's absolutely fascinating to me that they didn't come out with a DATEADD_BIG but what do you expect from MS (see my previous rant, as well) when they do things like come out with an EOMONTH function but not an FOMONTH function?

    And, yeah... glad they finally came out with a sequence generator and they finally fixed the huge oversight they made with String_Split().  I hope both are actually fast when 2022 RTM hits the streets.  I'm generally not an early adopter of their stuff, especially after what I saw happen in virtually every release they've had since 2000 SP3 and so I've not even thought about any of the pre-release stuff they came out with... They don't pay me enough to be a member of the "QA" Team .

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Looking back at this, I finally found a decent use for DATEDIFF_BIG... it's to create UNIX TimeStamps of both the "seconds since 1970" and "milliseconds since 1970" flavor (the later of which is necessarily a BIGINT, which also fixes the "2038" issue.

    Yes, that's exactly my use case.

  • With a minor change to the function you could increase the functionality so that instead of specifying a bit for Zero or One for the starting value you could specify a Start N. This would make it easier to call in some use cases:

    ALTER FUNCTION [dbo].[fnTally]
    (@StartN BIGINT, @MaxN BIGINT)
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN WITH
    H2(N) AS ( SELECT 1
    FROM (VALUES
    (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    )V(N)) --16^2 or 256 rows
    , H4(N) AS (SELECT 1 FROM H2 a, H2 b) --16^4 or 65,536 rows
    , H8(N) AS (SELECT 1 FROM H4 a, H4 b) --16^8 or 4,294,967,296 rows
    SELECT TOP(@MaxN - @StartN + 1)
    N = ROW_NUMBER() OVER (ORDER BY N) + @StartN - 1
    FROM H8
    ;

     

  • Well, if you call this fnTally with a start value of zero (0), you get two zeros (0). You can eliminate:

    SELECT N = 0 WHERE @StartN = 0 UNION ALL

    from the code.

     

     

  • Lynn Pettis wrote:

    Well, if you call this fnTally with a start value of zero (0), you get two zeros (0). You can eliminate:

    SELECT N = 0 WHERE @StartN = 0 UNION ALL

    from the code.

    Thanks, yes, I made some changes but then wanted it to look as much like the original function as possible so I put my changes in the original and I forgot to delete that bit. I'll edit it now.

  • Thanks, Jonathan.

    A long time ago, it used to make quite a difference performance wise.  Not so much on singleton big stuff like so many people test with but on the little stuff against a fairly large number of rows.  Since I need to start at "0" about 99% of the time and start at "1" the rest of the time, I didn't need to "punish" the function with extra work.

    I just did a quick "what if" test and it doesn't appear to make a difference anymore but it deserves a larger test with smaller stuff.  Who knows?  It would be pretty cool if it actually came out to be faster on today's procs.

    It's also a good reminder that "things change" and so thank you for that, as well.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • What about making datatype specific fnTally-ies?  Many times the maximum requirement is less than BIGINT.  What about fnTallySmalldatetime, fnTallyDatetime?  What about making separate 0+ version and 1+ versions?

    Any editorial regarding the new GENERATE_SERIES function?  I have an early impression which hopefully is incorrect.  We'll see.  Regarding syntax I believe (for the first time afaik) explicit parameter labels are required.  In the following the parameter labels START=, STOP=, and STEP= (if you use it) are required

    DECLARE @strt decimal(2, 1) = 0.0,
    @stop decimal(2, 1) = 1.0,
    @step decimal(2, 1) = 0.1;

    SELECT [value]
    FROM GENERATE_SERIES(START = @strt, STOP = @stop, STEP = @step);

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • I've found that changing datatypes in a Tally function really isn't necessary but that's a personal observation.  Implicit conversions usually work just fine here. I've not personally ever run into any issues and, if the target is created using a SELECT INTO or some similar reason, then I'll do the explicit cast outside of the function.  If we could make iTVFs ignore a parameter if it's not included, then maybe I'd add that ability as a parameter but until then, I won't add it.  I'll leave that up to the final user.

    Yep... I was also involved in one of Itzik's "lawn parties" on the subject.  I've needed a "step" ability other than just "1" about once in the last 2 decades and it was so easy to warp that, I don't consider it to be an essential bit of functionality.  To add to that, most people that need such steps need temporal steps and Jonathan's function does an awesome job there.

    I usually try to not make "multipurpose" "base tools" because that generally means that they'll usually run slower.  Of course, that's a bit "old school" when that was a whole lot more true than it is today.  It's like DelimitedSplit8K... it doesn't do multi-character delimiters and it doesn't work on the MAX datatypes because I didn't want to "punish" the function with slowness for infrequently used (by me, anyway) "flexibility".

    And to be sure, I'm not putting down any of these suggestions.  They're all good ideas.  I am, in fact and when I get some time, I'm going to do some testing (which I've not done in a long time) of extremely rapid fire small size returns for this and step and some of the things that came out at Itzik's "lawn party" (and are now coming out here) because they didn't test any of the suggested additions in an extremely rapid fire manner, which is how I normally use fnTally, which means I don't need to take a performance hit of even 1/2 a millisecond per call.

    I just haven't had the time to do so because of the old "I've got bigger fish to fry" thing going on, especially since they've come out with GENERATE_SERIES() after more than a dozen or so years after the original CONNECT item made by Erland Sommarskog and regularly annotated by years truly in favor of such a thing.  I don't get into pre-release stuff though.  They wouldn't listen to me anyway because they haven't in the past and I'm no longer keeping up my MVP status, which gives them even less reason to listen to me. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 12 posts - 31 through 41 (of 41 total)

You must be logged in to reply to this topic. Login to reply