INT to DATE vs STRING to DATE

  • All,

    I have a thought on converting integer value (whereas date is stored in INT format) to Date. The same column is declared as char(8), so the convert is char to date. Which one will be optimized one? i.e

    sample data: 20160101

    INT to DATE

    CHAR to DATE

    I just this topic few seconds back and posted the question immediately here (without doing any home work). I am also some experts thoughts/ideas/experience on this topic. Meanwhile I will also try from my end and share my thoughts.

    karthik

  • You can't directly convert an INT to a DATE. You could either convert an INT to DATETIME and then to DATE or you could use DATEADD(DAY, <your int field>, 0).

    My first guess would be that the CHAR to DATE would be more efficient, followed by DATEADD, followed by the double CONVERT, but I think that it would be close. Why don't you test it out and report your results back here?

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Most efficient would be the 3-byte date data type.

    Second would be an integer storing the number of days since January 1, 1900 (use DATEDIFF(d, 0, <input_date_value>) to calculate the value, DATEADD(d, <stored_column_value>, 0) to convert the value to a datetime value. Internally, datetime values are stored as two 4-byte integers: the number of days since 1900-01-01 and the number of 3ms ticks since midnight on that day. Note the value of "Day 0":SELECT CONVERT(DATETIME, 0);-----------------------

    1900-01-01 00:00:00.000

    (1 row(s) affected)

    You can also simply add integer values to dates and have them treated as adding full days:SELECT GETDATE() AS [Today], GETDATE() - 1 AS [Yesterday];Today Yesterday

    ----------------------- -----------------------

    2016-05-13 15:25:29.200 2016-05-12 15:25:29.200

    (1 row(s) affected)This is why an integer like 20160501 can't be converted directly to your date; CONVERT(datetime, 20160501) is asking for a date 20 million days into the future. The correct integer value for 2016-05-01 is actually 42489:SELECT convert(int, convert(datetime, '2016-05-01'));

    Storing an integer value such as 20160501, where visual display logic is forced into the data storage approach, will force you to convert the value to a string, then convert from string to date. 20160501 isn't a number: it's three separate numbers on three different numbering schemes (year = base10, month = base12, day = base28 to base31 variable) string-packed into an integer value. It will provide a clean sort column, but it can only rank behind the string representation of the same value in performance, because it does all the same things with additional overhead.

    Fun with internals 🙂

    -Eddie

    Eddie Wuerch
    MCM: SQL

  • karthik M (5/13/2016)


    All,

    I have a thought on converting integer value (whereas date is stored in INT format) to Date. The same column is declared as char(8), so the convert is char to date. Which one will be optimized one? i.e

    sample data: 20160101

    INT to DATE

    CHAR to DATE

    I just this topic few seconds back and posted the question immediately here (without doing any home work). I am also some experts thoughts/ideas/experience on this topic. Meanwhile I will also try from my end and share my thoughts.

    Recommend you use INT rather than CHAR, using DATEFROMPARTS with INT is almost twice as fast as any character to date conversion.

    😎

    Quick test

    USE TEEST;

    GO

    SET NOCOUNT ON;

    -- /*

    DECLARE @SAMPLE_SIZE INT = 1000000;

    IF OBJECT_ID(N'dbo.TBL_TEST_DATE_CONVERTION') IS NOT NULL DROP TABLE dbo.TBL_TEST_DATE_CONVERTION;

    CREATE TABLE dbo.TBL_TEST_DATE_CONVERTION

    (

    TTDC_ID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_DBO_TBL_TEST_DATE_CONVERTION_TTDC_ID PRIMARY KEY CLUSTERED

    ,TTDC_INT INT NOT NULL

    ,TTDC_CHAR CHAR(8) NOT NULL

    );

    ;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)

    INSERT INTO dbo.TBL_TEST_DATE_CONVERTION(TTDC_INT,TTDC_CHAR)

    SELECT

    CONVERT(INT,CONVERT(VARCHAR(8),DATEADD(DAY,ABS(CHECKSUM(NEWID())) % 100000,0),112),0) AS X

    ,CONVERT(INT,CONVERT(VARCHAR(8),DATEADD(DAY,ABS(CHECKSUM(NEWID())) % 100000,0),112),0) AS Y

    FROM NUMS NM;

    -- */

    DECLARE @timer TABLE (T_TEXT VARCHAR(30) NOT NULL, T_TS DATETIME2(7) NOT NULL DEFAULT(SYSDATETIME()));

    DECLARE @INT_BUCKET INT = 0;

    DECLARE @TINYINT_BUCKET TINYINT = 0;

    DECLARE @DATE_BUCKET DATE = CONVERT(DATE,GETDATE(),0);

    DECLARE @CHAR_BUCKET CHAR(8) = '';

    INSERT INTO @timer(T_TEXT) VALUES('DRY RUN');

    SELECT

    @INT_BUCKET = TC.TTDC_INT

    ,@CHAR_BUCKET = TC.TTDC_CHAR

    FROM dbo.TBL_TEST_DATE_CONVERTION TC;

    INSERT INTO @timer(T_TEXT) VALUES('DRY RUN');

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT CHAR TO DATE');

    SELECT

    @DATE_BUCKET = CONVERT(DATE,TC.TTDC_CHAR,112)

    FROM dbo.TBL_TEST_DATE_CONVERTION TC;

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT CHAR TO DATE');

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT INT TO DATE');

    SELECT

    @DATE_BUCKET = DATEFROMPARTS(

    TC.TTDC_INT / 10000

    ,(TC.TTDC_INT % 10000) / 100

    ,TC.TTDC_INT % 100

    )

    FROM dbo.TBL_TEST_DATE_CONVERTION TC;

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT INT TO DATE');

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT INT TO DATE DATEADD');

    SELECT

    @DATE_BUCKET = CONVERT(DATE,DATEADD(YEAR,(TC.TTDC_INT / 10000) - 1900

    ,DATEADD(MONTH,((TC.TTDC_INT % 10000) / 100) - 1

    ,DATEADD(DAY,((TC.TTDC_INT % 100) -1),0))),0)

    FROM dbo.TBL_TEST_DATE_CONVERTION TC;

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT INT TO DATE DATEADD');

    INSERT INTO @timer(T_TEXT) VALUES('INT TO DATE DATEADD MONTH');

    SELECT

    @DATE_BUCKET = CONVERT(DATE,DATEADD(DAY,TC.TTDC_INT % 100,(DATEADD(MONTH,((TC.TTDC_INT / 10000) - 1900) * 12

    + ((TC.TTDC_INT % 10000) / 100),0))),0)

    FROM dbo.TBL_TEST_DATE_CONVERTION TC;

    INSERT INTO @timer(T_TEXT) VALUES('INT TO DATE DATEADD MONTH');

    INSERT INTO @timer(T_TEXT) VALUES('DRY RUN 2');

    SELECT

    @INT_BUCKET = TC.TTDC_INT

    ,@CHAR_BUCKET = TC.TTDC_CHAR

    FROM dbo.TBL_TEST_DATE_CONVERTION TC;

    INSERT INTO @timer(T_TEXT) VALUES('DRY RUN 2');

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT CHAR TO DATE 2');

    SELECT

    @DATE_BUCKET = CONVERT(DATE,TC.TTDC_CHAR,112)

    FROM dbo.TBL_TEST_DATE_CONVERTION TC;

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT CHAR TO DATE 2');

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT INT TO DATE 2');

    SELECT

    @DATE_BUCKET = DATEFROMPARTS(

    TC.TTDC_INT / 10000

    ,(TC.TTDC_INT % 10000) / 100

    ,TC.TTDC_INT % 100

    )

    FROM dbo.TBL_TEST_DATE_CONVERTION TC;

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT INT TO DATE 2');

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT INT TO DATE DATEADD 2');

    SELECT

    @DATE_BUCKET = CONVERT(DATE,DATEADD(YEAR,(TC.TTDC_INT / 10000) - 1900

    ,DATEADD(MONTH,((TC.TTDC_INT % 10000) / 100) - 1

    ,DATEADD(DAY,((TC.TTDC_INT % 100) -1),0))),0)

    FROM dbo.TBL_TEST_DATE_CONVERTION TC;

    INSERT INTO @timer(T_TEXT) VALUES('CONVERT INT TO DATE DATEADD 2');

    INSERT INTO @timer(T_TEXT) VALUES('INT TO DATE DATEADD MONTH 2');

    SELECT

    @DATE_BUCKET = CONVERT(DATE,DATEADD(DAY,TC.TTDC_INT % 100,(DATEADD(MONTH,((TC.TTDC_INT / 10000) - 1900) * 12

    + ((TC.TTDC_INT % 10000) / 100),0))),0)

    FROM dbo.TBL_TEST_DATE_CONVERTION TC;

    INSERT INTO @timer(T_TEXT) VALUES('INT TO DATE DATEADD MONTH 2');

    SELECT

    T.T_TEXT

    ,DATEDIFF(MICROSECOND,MIN(T.T_TS),MAX(T.T_TS)) AS DURATION

    FROM @timer T

    GROUP BY T.T_TEXT

    ORDER BY DURATION;

    Results (1.33 / 4 Core Atom tablet)

    T_TEXT DURATION

    ------------------------------ -----------

    DRY RUN 2 738086

    DRY RUN 839591

    CONVERT INT TO DATE 963416

    CONVERT INT TO DATE 984083

    INT TO DATE DATEADD MONTH 2 1710921

    CONVERT CHAR TO DATE 1725744

    CONVERT CHAR TO DATE 2 1812689

    INT TO DATE DATEADD MONTH 1841711

    CONVERT INT TO DATE DATEADD 1895011

    CONVERT INT TO DATE DATEADD 2 1986124

  • Eddie Wuerch (5/13/2016)


    Most efficient would be the 3-byte date data type.

    +1.

    I couldn't resist, so I added a DATE column and a query that pulled that to the date_bucket (not being allowed to post the script, so I'll put results for now) 🙂

    T_TEXT DURATION

    ------------------------------ -----------

    CORRECT DATATYPE 182020

    CORRECT DATATYPE 2 185013

    DRY RUN 2 295032

    DRY RUN 307052

    CONVERT INT TO DATE 314033

    CONVERT INT TO DATE 2 328040

    CONVERT CHAR TO DATE 527028

    INT TO DATE DATEADD MONTH 2 546078

    INT TO DATE DATEADD MONTH 591071

    CONVERT CHAR TO DATE 2 632071

    CONVERT INT TO DATE DATEADD 657077

    CONVERT INT TO DATE DATEADD 2 710112

    Cheers!

  • Jacob Wilkins (5/14/2016)


    not being allowed to post the script

    What do you mean by this?

    😎

    Date is obviously better than int or char but the OP was exactly on those conversions to date.

  • Eirikur Eiriksson (5/14/2016)


    Jacob Wilkins (5/14/2016)


    not being allowed to post the script

    What do you mean by this?

    😎

    Date is obviously better than int or char but the OP was exactly on those conversions to date.

    I know what the OP was asking. 🙂

    I just figured it was good to reaffirm what Eddie pointed out. After all, if someone encountered corruption and asked whether doing CHECKDB with REPAIR_ALLOW_DATA_LOSS or detaching and reattaching the DB were the better option, I would probably throw in that neither is a good idea.

    It certainly is interesting to know which of the suboptimal approaches is better, but it's still worth pointing out both are indeed suboptimal.

    As to what I meant by not being allowed to post the script, I meant just that. Including the script prevented my previewing/posting the response (in an overly protective environment at the time of posting).

    Cheers!

  • karthik M (5/13/2016)


    All,

    I have a thought on converting integer value (whereas date is stored in INT format) to Date. The same column is declared as char(8), so the convert is char to date. Which one will be optimized one? i.e

    sample data: 20160101

    INT to DATE

    CHAR to DATE

    I just this topic few seconds back and posted the question immediately here (without doing any home work). I am also some experts thoughts/ideas/experience on this topic. Meanwhile I will also try from my end and share my thoughts.

    Better than that, share your test harness when you're done. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply