Using FORMAT() for Dates

  • Just did a quick test on SQL 2017, performance currently is not getting better, the difference is in the order of the second magnitude:exclamation:
    😎

    USE TEEST;
    GO
    SET NOCOUNT ON;
    DECLARE @SAMPLE_SIZE BIGINT   = 1000000;
    DECLARE @FIRSTDATE DATETIME  = '17991231';
    DECLARE @STR_BUCKET VARCHAR(50) = '';
    DECLARE @DT_BUCKET DATETIME2(7) = '19010101';
    DECLARE @TRES TABLE
    (
      TT_TS DATETIME2(7)     NOT NULL DEFAULT (SYSDATETIME())
     ,TT_TXT VARCHAR(50)      NOT NULL
    );
    IF OBJECT_ID(N'dbo.TBL_TEST_FORMAT') IS NOT NULL DROP TABLE dbo.TBL_TEST_FORMAT;
    CREATE TABLE dbo.TBL_TEST_FORMAT
    (
      TTF_ID INT  NOT NULL CONSTRAINT PK_DBO_TBL_TEST_FORMAT PRIMARY KEY CLUSTERED
     ,TDATE DATETIME NOT NULL
    );
    ;WITH T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))
    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N
         FROM T T1,T T2,T T3,T T4,T T5,T T6
         ORDER BY N)
    INSERT INTO dbo.TBL_TEST_FORMAT(TTF_ID,TDATE)     
    SELECT
     NM.N
    ,DATEADD(DAY,NM.N,@FIRSTDATE) AS TDATE
    FROM  NUMS  NM;
    -- DUMMY
    INSERT INTO @TRES(TT_TXT) VALUES('DUMMY')
      SELECT
       @DT_BUCKET = TD.TDATE
      FROM dbo.TBL_TEST_FORMAT TD
    INSERT INTO @TRES(TT_TXT) VALUES('DUMMY')
    INSERT INTO @TRES(TT_TXT) VALUES('FORMAT_1')
      SELECT
       @STR_BUCKET = FORMAT(TD.TDATE, 'U', 'us-en')
      FROM dbo.TBL_TEST_FORMAT TD
    INSERT INTO @TRES(TT_TXT) VALUES('FORMAT_1')
    INSERT INTO @TRES(TT_TXT) VALUES('CONVERT_1')
      SELECT
       @STR_BUCKET = CONVERT(VARCHAR(20),TD.TDATE, 101)
      FROM dbo.TBL_TEST_FORMAT TD
    INSERT INTO @TRES(TT_TXT) VALUES('CONVERT_1')
    SELECT
      T.TT_TXT
     ,DATEDIFF(MILLISECOND,MIN(T.TT_TS),MAX(T.TT_TS)) AS DURATION
    FROM @TRES T
    GROUP BY T.TT_TXT
    ORDER BY DURATION;

    Results

    TT_TXT    DURATION
    DUMMY     213
    CONVERT_1 333
    FORMAT_1  22454

    Conclution
    DO NOT USE the FORMAT function.

  • Luis Cazares - Thursday, May 24, 2018 9:15 AM

    I've never understood the option to save programmer time by not caring about computing time. An extra hour spent by a programmer on writing good code could save several hours a week when the process is released to production. I've seen it and I'm being very conservative as improvements can be millions to one when comparing hours spent in processing vs programming.

    That's because you're looking at the happy path. The alternative path is often programmers spending weeks writing meh code that doesn't necessarily improve anything. Or they spend lots of time trying to tune code to make it better, but the improvement isn't worth the investment.

    Those happen. One reason why people trade hardware for labor. Often that's a better ROI.

    Not always, and it's a judgment on when one is better.

  • Steve Jones - SSC Editor - Thursday, May 24, 2018 9:25 AM

    Jeff Moden - Thursday, May 24, 2018 5:21 AM

    Nice write-up but, even with the warnings, I'm reminded of people that justify poor programming techniques or functionality because of low row counts.  It's a bit like teaching people how to use a gun that looks like this.

    I guess, but what's the alternative? Don't talk about it? Let people stumble on it? I'd like to think they stumble on my article and read the warning. At least then it might stick in their mind when they do something and it doesn't perform.

    You're welcome to write a , don't use FORMAT(), Do This article, but complaints don't get read or spread as widely as information.

    I agree with this. I prefer that people looking for information on how to use FORMAT get this warning instead of just missing it entirely.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Eirikur Eiriksson - Thursday, May 24, 2018 9:26 AM

    Just did a quick test on SQL 2017, performance currently is not getting better, the difference is in the order of the second magnitude:exclamation:
    😎

    Conclution
    DO NOT USE the FORMAT function.

    I'm sure that's the case. Didn't hear/see anything in 2017 that was fixing language items.

  • Luis Cazares - Thursday, May 24, 2018 9:15 AM

    patrickmcginnis59 10839 - Thursday, May 24, 2018 7:59 AM

    Jeff Moden - Thursday, May 24, 2018 5:21 AM

    Nice write-up but, even with the warnings, I'm reminded of people that justify poor programming techniques or functionality because of low row counts.  It's a bit like teaching people how to use a gun that looks like this.

    Steve gave the performance warning. Sometimes programmer time is traded for computer time, and often workloads are low and predictably so. I think there's a school of thought that every bit of code needs to be as optimized as possible and that's fine for where it fits, and given T-SQL's bad interpretive performance its a good idea to be extra cautious with SQL Server. 

    Sometimes your code simply cannot avoid the iterative penalty of T-SQL and in those cases format wouldn't be disqualified out of hand because its performance problems are dwarfed by the so called "RBAR" hit.

    I've never understood the option to save programmer time by not caring about computing time. An extra hour spent by a programmer on writing good code could save several hours a week when the process is released to production. I've seen it and I'm being very conservative as improvements can be millions to one when comparing hours spent in processing vs programming.

    Yeah if you are spending several hours a week because of format then that's what Steve cautioned against by warning about format's performance cost. I can also understand that there are folks who do not like trading programmer time for computer time, but often they change their tune when programmer time gets unaffordable. Format could be useful when the cost of using it has little impact, but this would probably NOT be the case if you are processing enough records to where the impact becomes a cost worth considering.

    I know folks don't understand programmer time / computer time tradeoffs, but that's up to you. If programmer time is free then have at it!

  • I'll also add that I've worked on lots of systems that trundled along at 8% CPU and few IOPS all day long. Those systems could be written with cursor processing of child/parent relationships and FORMAT() everywhere and do fine. Some of them likely do with EF code in the front end.

    FORMAT() has a place. A limited place, but still, it could be good

  • patrickmcginnis59 10839 - Thursday, May 24, 2018 9:47 AM

    Yeah if you are spending several hours a week because of format then that's what Steve cautioned against by warning about format's performance cost. I can also understand that there are folks who do not like trading programmer time for computer time, but often they change their tune when programmer time gets unaffordable. Format could be useful when the cost of using it has little impact, but this would probably NOT be the case if you are processing enough records to where the impact becomes a cost worth considering.

    I know folks don't understand programmer time / computer time tradeoffs, but that's up to you. If programmer time is free then have at it!

    Just remember, the time that you saved today will be charged with interests when performance becomes an issue. Basically, you're releasing defects into production.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Steve Jones - SSC Editor - Thursday, May 24, 2018 9:58 AM

    I'll also add that I've worked on lots of systems that trundled along at 8% CPU and few IOPS all day long. Those systems could be written with cursor processing of child/parent relationships and FORMAT() everywhere and do fine. Some of them likely do with EF code in the front end.

    FORMAT() has a place. A limited place, but still, it could be good

    Is that how you did the anti-spam for this website?
    😎

  • Luis Cazares - Thursday, May 24, 2018 10:01 AM

    patrickmcginnis59 10839 - Thursday, May 24, 2018 9:47 AM

    Yeah if you are spending several hours a week because of format then that's what Steve cautioned against by warning about format's performance cost. I can also understand that there are folks who do not like trading programmer time for computer time, but often they change their tune when programmer time gets unaffordable. Format could be useful when the cost of using it has little impact, but this would probably NOT be the case if you are processing enough records to where the impact becomes a cost worth considering.

    I know folks don't understand programmer time / computer time tradeoffs, but that's up to you. If programmer time is free then have at it!

    Just remember, the time that you saved today will be charged with interests when performance becomes an issue. Basically, you're releasing defects into production.

    Yeah but chances are I can afford to get the salt myself!

  • Eirikur Eiriksson - Thursday, May 24, 2018 10:09 AM

    Is that how you did the anti-spam for this website?
    😎

    Ha ha. No, and I didn't do it.

    It's a harder problem than you realize, though we certainly could improve.

  • Steve Jones - SSC Editor - Thursday, May 24, 2018 9:58 AM

    I'll also add that I've worked on lots of systems that trundled along at 8% CPU and few IOPS all day long. Those systems could be written with cursor processing of child/parent relationships and FORMAT() everywhere and do fine. Some of them likely do with EF code in the front end.

    FORMAT() has a place. A limited place, but still, it could be good

    Nice article, Steve. And yes, absolutely there is a place for such functionality. Not every operation is executed frequently or against large datasets. Not all projects have appropriate dev resources to handle such things in the app code, just like not all projects have appropriate DB developer resources to do proper data modeling and/or coding.

    Besides, trying to ignore something doesn't make it go away, so providing information, along with appropriate warnings, such as you have done, is the best approach. The fact that something can be used incorrectly, or inappropriately, is not a valid reason for not having that thing exist. Should we take away Dynamic SQL, LEFT JOINs, and pretty much anything that has ever existed?

    ---

    Regarding the statement of " SSMS seems to think this is a NVARCHAR(250) ", the FORMAT function actually passes back NVARCHAR(4000). You can see this by either doing a SELECT INTO or by using the SQL_VARIANT_PROPERTY function as follows:

    SELECT
      FORMAT(GETDATE(), 'd'), -- 5/24/2018
      SQL_VARIANT_PROPERTY(FORMAT(GETDATE(), 'd'), 'BaseType'), -- nvarchar
      SQL_VARIANT_PROPERTY(FORMAT(GETDATE(), 'd'), 'MaxLength'); -- 8000

    It most likely passes back the full 4000 because the format string can contain extra stuff and could technically be 4000 characters. For example:

    SELECT
      FORMAT(GETDATE(), 'lalala $$$$$ d ----------- M ---------- yyyy >>>>>>>>> mm ... MMM');

    -- lalala $$$$$ 24 ----------- 5 ---------- 2018 >>>>>>>>> 50 ... May

    And while you did mention that the FORMAT built-in function was added in SQL Server 2012, I will mention that it is possible to get this functionality in SQL Server 2005, 2008, and 2008 R2 (for those still stuck on those versions) via SQLCLR. The SQL# SQLCLR library (that I wrote, and this function is available in the free version) comes with a "Date_Format" function that does the same thing that the built-in FORMAT function does, and it works on SQL Server 2005, 2008, and 2008 R2. The following statement (which requires SQL Server 2012 or newer as it compares both functions)

    SELECT
      SQL#.Date_Format('2018-05-24', N'<yyyy> [MMM]', 'he'),
      FORMAT(CONVERT(DATE, '2018-05-24'), N'<yyyy> [MMM]', 'he');

    returns the same value for both columns:

    <2018> [מ××™]

    Take care, Solomon...

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Steve Jones - SSC Editor - Thursday, May 24, 2018 9:58 AM

    I'll also add that I've worked on lots of systems that trundled along at 8% CPU and few IOPS all day long. Those systems could be written with cursor processing of child/parent relationships and FORMAT() everywhere and do fine. Some of them likely do with EF code in the front end.

    FORMAT() has a place. A limited place, but still, it could be good

    Heh.... so if you only lived 4 miles from work, you wouldn't mind paying $20 a gallon for gas? 😉

    I guess I'll never understand why a lot of people excuse poor code or poor tools based on low usage or low row counts.  These and other forums are littered with posts where such things happened and now are in serious trouble with performance.   It takes no extra time to just do it right from the beginning.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Steve Jones - SSC Editor - Thursday, May 24, 2018 9:25 AM

    Jeff Moden - Thursday, May 24, 2018 5:21 AM

    Nice write-up but, even with the warnings, I'm reminded of people that justify poor programming techniques or functionality because of low row counts.  It's a bit like teaching people how to use a gun that looks like this.

    I guess, but what's the alternative? Don't talk about it? Let people stumble on it? I'd like to think they stumble on my article and read the warning. At least then it might stick in their mind when they do something and it doesn't perform.

    You're welcome to write a , don't use FORMAT(), Do This article, but complaints don't get read or spread as widely as information.

    BWAAAHAAA!!! See?  You actually DO know better so why didn't YOU write such an article instead of the one you wrote? 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Solomon Rutzky - Thursday, May 24, 2018 2:59 PM

    Steve Jones - SSC Editor - Thursday, May 24, 2018 9:58 AM

    I'll also add that I've worked on lots of systems that trundled along at 8% CPU and few IOPS all day long. Those systems could be written with cursor processing of child/parent relationships and FORMAT() everywhere and do fine. Some of them likely do with EF code in the front end.

    FORMAT() has a place. A limited place, but still, it could be good

    Nice article, Steve. And yes, absolutely there is a place for such functionality. Not every operation is executed frequently or against large datasets. Not all projects have appropriate dev resources to handle such things in the app code, just like not all projects have appropriate DB developer resources to do proper data modeling and/or coding.

    Besides, trying to ignore something doesn't make it go away, so providing information, along with appropriate warnings, such as you have done, is the best approach. The fact that something can be used incorrectly, or inappropriately, is not a valid reason for not having that thing exist. Should we take away Dynamic SQL, LEFT JOINs, and pretty much anything that has ever existed?

    ---

    Regarding the statement of " SSMS seems to think this is a NVARCHAR(250) ", the FORMAT function actually passes back NVARCHAR(4000). You can see this by either doing a SELECT INTO or by using the SQL_VARIANT_PROPERTY function as follows:

    SELECT
      FORMAT(GETDATE(), 'd'), -- 5/24/2018
      SQL_VARIANT_PROPERTY(FORMAT(GETDATE(), 'd'), 'BaseType'), -- nvarchar
      SQL_VARIANT_PROPERTY(FORMAT(GETDATE(), 'd'), 'MaxLength'); -- 8000

    It most likely passes back the full 4000 because the format string can contain extra stuff and could technically be 4000 characters. For example:

    SELECT
      FORMAT(GETDATE(), 'lalala $$$$$ d ----------- M ---------- yyyy >>>>>>>>> mm ... MMM');

    -- lalala $$$$$ 24 ----------- 5 ---------- 2018 >>>>>>>>> 50 ... May

    And while you did mention that the FORMAT built-in function was added in SQL Server 2012, I will mention that it is possible to get this functionality in SQL Server 2005, 2008, and 2008 R2 (for those still stuck on those versions) via SQLCLR. The SQL# SQLCLR library (that I wrote, and this function is available in the free version) comes with a "Date_Format" function that does the same thing that the built-in FORMAT function does, and it works on SQL Server 2005, 2008, and 2008 R2. The following statement (which requires SQL Server 2012 or newer as it compares both functions)

    SELECT
      SQL#.Date_Format('2018-05-24', N'<yyyy> [MMM]', 'he'),
      FORMAT(CONVERT(DATE, '2018-05-24'), N'<yyyy> [MMM]', 'he');

    returns the same value for both columns:

    <2018> [מ××™]

    Take care, Solomon...

    I absolutely agree that sharing information is important.  So here goes.... there is no place where someone should use code that is a known performance problem when there are easy alternatives.  This could have been a must-read article instead of teaching how to use performance challenged code.  Justifying its use by saying it'll save on developer time also means there's a bigger problem... you haven't taught the developer the right way and they'll never get better at it because they think they're doing it the right way.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Eirikur Eiriksson - Thursday, May 24, 2018 8:17 AM

    Jeff Moden - Thursday, May 24, 2018 8:02 AM

    C'mon... such a thing is easy to scale up. 😀

    You owe me yet another keyboard Jeff (not charging for the pint I spilled over it)
    😎

    Where do you find such "clever" images? Could be very useful for some presentations!
    BTW, I like more to design to a scale out, any suggestion?
    Image result for self destruction mechanism

    No problem...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 45 total)

You must be logged in to reply to this topic. Login to reply