Dealing with custom date formats in T-SQL

Question

Dealing with custom date formats in T-SQL

Viewing 15 posts - 16 through 30 (of 33 total)

You must be logged in to reply to this topic. Login to reply

Eric M Russell SSC Guru Points: 125596 More actions · Answer 1

Formatting dates in the presentation layer is best, because how the application wants dates formatted and how reporting wants dates formatted will be different, even if you get them to initially agree on a standard.

Still, formatting dates within the SELECT statement is tolerable and sometimes necessary.

However, containing date/time values in VARCHAR columns is perhaps one of the worst database anti-patterns ever. Not only does it require more effort to insure that all applications INSERT using a consistent format, but VARCHAR "dates" consume more storage space, and depending on the format, it often requires the use non-indexable conversion expressions in the WHERE clause.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

spaghettidba SSC Guru Points: 105732 More actions · Answer 2

Eric M Russell (3/23/2012)
Formatting dates in the presentation layer is best, because how the application wants dates formatted and how reporting wants dates formatted will be different, even if you get them to initially agree on a standard.
Still, formatting dates within the SELECT statement is tolerable and sometimes necessary.
However, containing date/time values in VARCHAR columns is perhaps one of the worst database anti-patterns ever. Not only does it require more effort to insure that all applications INSERT using a consistent format, but VARCHAR "dates" consume more storage space, and depending on the format, it often requires the use non-indexable conversion expressions in the WHERE clause.

I completely agree.

Not to mention the inability to sort properly, perform date/time specific calculations, check against invalid values (february 29 on non-leap years etc.)... and I could go on forever!

Another date/time anti-pattern that I saw implemented in some shops is a separate column for each date part: one (int!!!!!!!) column for the year, one column for the month etc... Fits well with a DWH calendar table, but is a horrible choice for OLTP.

-- Gianluca Sartori

Eric M Russell SSC Guru Points: 125596 More actions · Answer 3

Gianluca Sartori (3/23/2012)
Eric M Russell (3/23/2012)
Formatting dates in the presentation layer is best, because how the application wants dates formatted and how reporting wants dates formatted will be different, even if you get them to initially agree on a standard.
Still, formatting dates within the SELECT statement is tolerable and sometimes necessary.
However, containing date/time values in VARCHAR columns is perhaps one of the worst database anti-patterns ever. Not only does it require more effort to insure that all applications INSERT using a consistent format, but VARCHAR "dates" consume more storage space, and depending on the format, it often requires the use non-indexable conversion expressions in the WHERE clause.
I completely agree.
Not to mention the inability to sort properly, perform date/time specific calculations, check against invalid values (february 29 on non-leap years etc.)... and I could go on forever!
Another date/time anti-pattern that I saw implemented in some shops is a separate column for each date part: one (int!!!!!!!) column for the year, one column for the month etc... Fits well with a DWH calendar table, but is a horrible choice for OLTP.

When confronted with a VARCHAR date, it is still possible for us to impose our will upon it at the DDL level. The following check constraint not only requires the value to be coded in the format YYYYMMDD, but it will require that the value be a valid calendar date. For example, it rejects an insert of February 29, 2011 but accepts February 29, 2012.

create table foo

(

foo_date varchar(30) not null

constraint ck_foo_date_yyyymmdd

check (foo_date = convert(char(8),cast(foo_date as datetime),112))

);

insert into foo (foo_date) values ('2011/02/28');

Error 547: The INSERT statement conflicted with the CHECK constraint "ck_foo_date_yyyymmdd".

:satisfied:

insert into foo (foo_date) values ('20110229');

Error 242: The conversion of a varchar data type to a datetime data type resulted in an out-of-range value.

:satisfied:

insert into foo (foo_date) values ('20120229');

(1 row(s) affected)

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

gclausen SSC Enthusiast Points: 112 More actions · Answer 4

Very nice article. I was especially interested where you showed that CLR methods performed much better. I've heard this about CLR performing better in most cases.,

spaghettidba SSC Guru Points: 105732 More actions · Answer 5

gclausen (3/23/2012)
Very nice article. I was especially interested where you showed that CLR methods performed much better. I've heard this about CLR performing better in most cases.,

Well, it depends on the task you have to accomplish.

When it comes down to complex string manipulations, CLR is nearly always the best choice.

The same can't be said for data manipulation.

-- Gianluca Sartori

gclausen SSC Enthusiast Points: 112 More actions · Answer 6

gclausen

SSC Enthusiast

Points: 112

March 23, 2012 at 8:28 am

#1463429

Agree - it just depends on the task at hand.

sknox SSChampion Points: 12388 More actions · Answer 7

Very nice article. It provides a set of useful solutions to a common issue, while also highlighting the fact that some common "mantras" (i.e, Inline TVFs are more efficient than scalar UDFs) are not always true.

I'd have liked to see some charts showing how the different solutions scale from 1 to millions of rows, but that's really not something I feel should be required of a short article like this, and you've provided enough information for me to test this on my own if I want to take the time.

Finally one question: did you do your testing on a preview or RTM version of SQL Server 2012? If it's a preview version, there might be some debugging code slowing down the new function.

spaghettidba SSC Guru Points: 105732 More actions · Answer 8

sknox (3/23/2012)
Very nice article. It provides a set of useful solutions to a common issue, while also highlighting the fact that some common "mantras" (i.e, Inline TVFs are more efficient than scalar UDFs) are not always true.
I'd have liked to see some charts showing how the different solutions scale from 1 to millions of rows, but that's really not something I feel should be required of a short article like this, and you've provided enough information for me to test this on my own if I want to take the time.
Finally one question: did you do your testing on a preview or RTM version of SQL Server 2012? If it's a preview version, there might be some debugging code slowing down the new function.

Ha! Nice catch.

No, I tested the code with CTP3 and RC0, because RTM was not ready yet at the time.

As far as the chart is concerned, 20 seconds of Excel produced this:

-- Gianluca Sartori

Steven Willis SSCrazy Eights Points: 9893 More actions · Answer 9

Excellent functions. I've already tested them on my own server (2008) and added them to my toolbox of functions. In case you are interested, I added a tiny modification to both functions to handle Day Name Suffixes such as "1st," "2nd," "3rd," and "4th." This code piece can just be added right at the end of the "formattedDate" SELECT statement.

WHEN 'x' THEN

CASE

WHEN CAST(DAY(@date) AS INT) IN (1,21,31) THEN N'st'

WHEN CAST(DAY(@date) AS INT) IN (2,22) THEN N'nd'

WHEN CAST(DAY(@date) AS INT) IN (3,23) THEN N'rd'

ELSE N'th'

END

Now a mask like this can be used 'DDDD, MMMM DDx, YYYY' to produce 'Friday, March 23rd, 2012'.

Also, the difference between 'hh' and 'h' wasn't clear and didn't seem to produce different results no matter what dates I used for input. Since the 'MM' vs 'M', 'DD' vs 'D', 'HH' vs 'H' all produced the month or day with or without a leading zero respectively, I would assume the same would be the case for 'hh' and 'h'. So I modified the code as:

WHEN 'h' THEN

CASE WHEN LEFT(

RIGHT('00' + CAST(

CASE DATEPART(hour,@date)

WHEN 12 THEN 12

ELSE DATEPART(hour,@date) % 12

END AS NVARCHAR(2)),2),1) = 0 THEN

RIGHT('00' + CAST(

CASE DATEPART(hour,@date)

WHEN 12 THEN 12

ELSE DATEPART(hour,@date) % 12

END AS NVARCHAR(2)),1)

ELSE

RIGHT('00' + CAST(

CASE DATEPART(hour,@date)

WHEN 12 THEN 12

ELSE DATEPART(hour,@date) % 12

END AS NVARCHAR(2)),2)

END

Perhaps there is a more efficient way of doing this, but I basically just took the code for 'hh' formatting and turned it into a case statement.

Now, to be thorough, I also added to the masking function of the ParseString function to allow for filtering out Day Name suffixes:

Added to the 'allowedTokens'

UNION ALL

SELECT

'x'

,'DAYSUFFIX'

Added to the 'pivotedTokens'

,daysuffix_value = ''

and

,[DAYSUFFIX]

Added to the 'verifiedTokens' pivot list

,[DAYSUFFIX]

WITHOUT these changes this: 'October 26th 2011', 'MMMM DD YYYY' will cause an error

WITH these changes this: 'October 26th 2011', 'MMMM DDx YYYY' parses the date successfully

Lisa Slater Nicholls Hall of Fame Points: 3461 More actions · Answer 10

charles.southey (3/23/2012)
Yes - it's relevant to the article and provides a possible solution for those grappling with the problem under discussion that they may be looking for. We would never post something that is not directly relevant. Not everyone wants a commercial solution - but I think it's worth knowing that it's there if you do. BTW the product is perpetually free for use on Developer Edition.

OK, Charles, since you're here with your potential solution... how does your product stack up against the author's custom CLR function and the rest?

When doing exactly the task(s) described in the article, how long does it take?

I'm sure there are other potential comparisons (flexibility, ease of use, etc), and if you discussed those here I'd consider it an advertisement for sure.

But speed is really a comparison that would add to this conversation, IMHO, without being an "ad". The perf disparity between FORMAT() and the custom function is very interesting and has made me very curious. Is there some general pattern here, what can we learn?

Thanks in advance,

>L<

Jeff Moden SSC Guru Points: 1004619 More actions · Answer 11

Lisa Slater Nicholls (3/23/2012)
charles.southey (3/23/2012)
Yes - it's relevant to the article and provides a possible solution for those grappling with the problem under discussion that they may be looking for. We would never post something that is not directly relevant. Not everyone wants a commercial solution - but I think it's worth knowing that it's there if you do. BTW the product is perpetually free for use on Developer Edition.
OK, Charles, since you're here with your potential solution... how does your product stack up against the author's custom CLR function and the rest?
When doing exactly the task(s) described in the article, how long does it take?
I'm sure there are other potential comparisons (flexibility, ease of use, etc), and if you discussed those here I'd consider it an advertisement for sure.
But speed is really a comparison that would add to this conversation, IMHO, without being an "ad". The perf disparity between FORMAT() and the custom function is very interesting and has made me very curious. Is there some general pattern here, what can we learn?
Thanks in advance,
>L<

Oh My! There's another "me" out there! I was going to ask Charles for his performance stats on 10,000, 100,000, and 1 million rows of dates and even offer to build the test table for him. Well done, Lisa!

The only bad part about Charles doing the testing is having the proverbial mouse guard the cheese.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004619 More actions · Answer 12

Gianluca, my ol' and trusted friend,

Very nice and well written article with lots of well fomatted, easy to read, documented code. Nicely done.

Shifting gears, I have a couple of questions, please.

1. What is the configuration and speed of the computer you did the testing on?

2. Which edition and version of SQL Server and Windows did you run your tests on?

3. When you did your duration testing, where did you have the result set return to? A table? Grid mode on the screen? A throw-away variable? or ???

4. When you did your duration testing, how did you measure the duration? STATISTICS TIME ON? SQL Profiler? or ???

5. When you did your duration testing, what was the configuration of your test table? Single column of dates? Indexes or not?

6. Is there any chance of you attaching your test data generator and your test harness to this thread so I can make sure that my testing is done the same way you did it?

7. Since I don't even know how to spell C#, would it be possible for you to script the CLR assembly out in a "Paul White" fashion so that I could install it on my machine for a little testing of my own?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Lisa Slater Nicholls Hall of Fame Points: 3461 More actions · Answer 13

The only bad part about Charles doing the testing is having the proverbial mouse guard the cheese.

Actually, Jeff, in this case, I look at it as "put up or shut up" ;-).

Regardless of the author's original computer specs (although that made me curious too, thanks for asking hiim), I'd like to see the vendor run the code in this article, on any machine or set of machines, plus his own version, and provide all stats on a level playing field.

Here's a case where we *know* the test was not vendor-designed. If Charles runs exactly that code, plus his, in like conditions (and I invite you to add conditions: start sql server up cold? anything?), we should really learn something.

>L<

TomThomson SSC Guru Points: 104773 More actions · Answer 14

Really good article, I like it very much. So please don't take the following as suggesting that it isn't, because that would be utterly wrong.

There are a couple of things that perhaps should have been mentioned (to warn people that they need to be considered when relevant, not to offer solutions in the article because covering everyting would make the article too long and complicated and anyway most people don't need to worry about either of these things).

1) Everything is varchar. But I may want to be able to handle languages that don't work with varchar (256 is quite a small number, there are non-alphabetic languages that have to be coped with and 256 characters is sometimes far too few). I don't think there's any need to cope with more than one language at a time, but if there were that would make it even harder to fit into varchar.

2) None of your functions will work if you are outside the Gregorian calendar. We can use T-SQL cast/convert to get Kuwaiti Hijri, as for example

select convert(varchar(32),CONVERT(datetime2,'2012-03-23 21:00:00.000'),131)

using SQL Server's capability to handle Kuwaiti dates (but I wouldn't bet on it delivering correct dates in the future, as one form or another of fundamentalist may succede in forcing a change in Kuwaiti dates). But this will not work for Saudi format, or Iranian format, or.... (the year numbers will be different hardly ever, the month numbers will be different a little of the time, the day numbers are likely to be different a lot of the time). So the argument that built in date handling by CONVERT solves the Hijri problem doesn't work, because only Kuwaiti dates can be handled. Islamic Hijri isn't the only calendar in current use whose months and years don't match up with the Gregorian calendar, even the Julian calendar would be a trivial example if anyone actually used it any more.

So people need to watch out for alphabets which don't have 256 or fewer characters and for calendars which have different (possibly variable) year lengths and different (possibly variable) month lengths from our Western culture.

Tom

spaghettidba SSC Guru Points: 105732 More actions · Answer 15

Steven Willis (3/23/2012)
Excellent functions. I've already tested them on my own server (2008) and added them to my toolbox of functions. In case you are interested, I added a tiny modification to both functions to handle Day Name Suffixes such as "1st," "2nd," "3rd," and "4th." This code piece can just be added right at the end of the "formattedDate" SELECT statement.
...
Now a mask like this can be used 'DDDD, MMMM DDx, YYYY' to produce 'Friday, March 23rd, 2012'.

Thank you, Steven. This could be a valuable addition to the code.

However, I tried to stick to the date format placeholders you can find on this page on MSDN. I tried parsing a date containing the day name suffix with the built-in PARSE function and it did not work.

Also, the difference between 'hh' and 'h' wasn't clear and didn't seem to produce different results no matter what dates I used for input. Since the 'MM' vs 'M', 'DD' vs 'D', 'HH' vs 'H' all produced the month or day with or without a leading zero respectively, I would assume the same would be the case for 'hh' and 'h'. So I modified the code as:
WHEN 'h' THEN
CASE WHEN LEFT(
RIGHT('00' + CAST(
CASE DATEPART(hour,@date)
WHEN 12 THEN 12
ELSE DATEPART(hour,@date) % 12
END AS NVARCHAR(2)),2),1) = 0 THEN
RIGHT('00' + CAST(
CASE DATEPART(hour,@date)
WHEN 12 THEN 12
ELSE DATEPART(hour,@date) % 12
END AS NVARCHAR(2)),1)
ELSE
RIGHT('00' + CAST(
CASE DATEPART(hour,@date)
WHEN 12 THEN 12
ELSE DATEPART(hour,@date) % 12
END AS NVARCHAR(2)),2)
END
Perhaps there is a more efficient way of doing this, but I basically just took the code for 'hh' formatting and turned it into a case statement.

Nice catch, thank you.

-- Gianluca Sartori

Dealing with custom date formats in T-SQL

Cookies on SQLServerCentral