Blog Post

Converting Dates

,

I ran across a post recently where someone had dates stored as characters (never good), but also in this format: CYYMMDD. I’d never seen that, and thought it was strange. The person was asking for a way to convert this to YYYY-MM-DD, which I think it fairly easy.

Let’s set up some data:

CREATE TABLE RandomDates
(
    myid INT
    , mydate VARCHAR(7)
);
GO
INSERT dbo.RandomDates
        ( myid
        , mydate
        )
    VALUES
        ( 1, '0600102' )
       , ( 2, '1121004' )
       , ( 3, '0920415' )
       , ( 4, '1040611' )
        ;
GO

The format for the data is 7 characters denoted as century, year, month, day. The century is encoded, with the single value representing:

  • 0 = 1900
  • 1 = 2000

That’s pretty straightforward and it means that we have two dates in the 1900s and 2 in the 2000s. The dates are:

  • January 2, 1960
  • April 15, 1992
  • June 11, 2004
  • October 4, 2012

Querying these dates and converting them is straightforward, but there are a couple ways to do this. If we are just looking to convert these to characters, I saw this solution from ZZartin.

 SELECT CONVERT(DATETIME, CASE 
                            WHEN LEFT(mydate, 1) = '1' 
                                THEN '20' 
                                ELSE '19' 
                            END 
                            + RIGHT(rd.mydate, 6)
                , 112)
  FROM dbo.RandomDates AS rd;

That’s fairly straightforward, and overall I like it. It uses simple functions and puts things together.

I had another idea, mostly because I initially favored keeping the items separate as parts of the date in case I needed them. I thought about DATEFROMPARTS, which is a SQL Server 2012+ function. My thought was to calculate each of the parts and send them into the function like this:

SELECT DATEFROMPARTS(
                    CASE WHEN LEFT(mydate, 1) = '0'
                        THEN 1900
                        ELSE 2000
                        END + SUBSTRING( rd.mydate, 2, 2)
                    , SUBSTRING( rd.mydate, 4, 2)
                    , SUBSTRING( rd.mydate, 6, 2)
                    )
 FROM dbo.RandomDates AS rd
 ;

My idea has more function calls, and I’d think it would take longer to build, but I’m not sure. Let’s test.

I’ll use Data Generator to insert a few million rows into this table. Then let’s run both pieces of code.

The first code, from ZZartin, required about 19,000 logical reads and this execution plan when I ran it a few times. The CPU execution was in the 40k ms, and about 9 sec of real time.

dates1

The second code, mine, had about 30,000 logical reads, with only 8k CPU ms, but about 40sec of real time. The execution plan:

dates2

That’s interesting, and it matches what I’d expect. The first code is much simpler, intuitively, and it’s easy to read. With the format of the date essentially in order, it doesn’t make sense to try and "assemble" the date from parts. It’s easier to convert the first character to two (with the CASE) and then just cast this as a date.

Filed under: Blog Tagged: syndicated, T-SQL

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating