Convert DD MMM YYYY to yyyymmdd in string

Question

Convert DD MMM YYYY to yyyymmdd in string

dc.pc100

Mr or Mrs. 500

Points: 571
More actions
September 29, 2013 at 6:21 pm

#277594

Hi,
I am struggling with changing the date format of dd MMM YYYY in the sample strings below into yyymmdd
Sample Data Input
Column Name:
SourceName
12 Jan 2013 Test.docx
01 February 2001 File.pptx
Meeting 04 Feb 2012.xls
09 September 2011.txt
30 Jan 13.doc
Sample Data Output
Column Name:
SourceName
20130112 Test.docx
20010201 File.pptx
Meeting 20120204.xls
20110911.txt
20130130.doc

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply

Dwain Camps SSC Guru Points: 86908 More actions · Answer 1

Post retracted. Hang on for a replacement.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86908 More actions · Answer 2

Ok. This should work.

But someone I'm sure will probably come up with a better way.

WITH SampleData (sourcedata) AS

(

SELECT '12 Jan 2013 Test.docx'

UNION ALL SELECT '01 February 2001 File.pptx'

UNION ALL SELECT 'Meeting 04 Feb 2012.xls'

UNION ALL SELECT '09 September 2011.txt'

UNION ALL SELECT '30 Jan 13.doc'

)

SELECT sourcedata, sourcedata2=STUFF(sourcedata, pos1, (pos2+4)-pos1

,yy + COALESCE(m1, m2, m3, m4, m5, m6, m7, m8, m9, m10, m11, m12) + dd)

FROM SampleData a

CROSS APPLY

(

SELECT pos1=PATINDEX('%[0-9][0-9]%', sourcedata)

,x=CASE PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata) WHEN 0 THEN 2 ELSE 4 END

) b

CROSS APPLY

(

SELECT pos2=CASE x

WHEN 4 THEN PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata)

ELSE PATINDEX('%[0-9][0-9]%', RIGHT(sourcedata, LEN(sourcedata)-(pos1+1))) END

) c

CROSS APPLY

(

SELECT dd=SUBSTRING(sourcedata, pos1, 2)

,yy=CASE x WHEN 4 THEN SUBSTRING(sourcedata, pos2, x)

ELSE '20' + SUBSTRING(sourcedata, 2+pos2, x)

END

) d

CROSS APPLY

(

SELECT m1=RIGHT(100+NULLIF(1*SIGN(CHARINDEX('Jan', sourcedata)), 0), 2)

,m2=RIGHT(100+NULLIF(2*SIGN(CHARINDEX('Feb', sourcedata)), 0), 2)

,m3=RIGHT(100+NULLIF(3*SIGN(CHARINDEX('Mar', sourcedata)), 0), 2)

,m4=RIGHT(100+NULLIF(4*SIGN(CHARINDEX('Apr', sourcedata)), 0), 2)

,m5=RIGHT(100+NULLIF(5*SIGN(CHARINDEX('May', sourcedata)), 0), 2)

,m6=RIGHT(100+NULLIF(6*SIGN(CHARINDEX('Jun', sourcedata)), 0), 2)

,m7=RIGHT(100+NULLIF(7*SIGN(CHARINDEX('Jul', sourcedata)), 0), 2)

,m8=RIGHT(100+NULLIF(8*SIGN(CHARINDEX('Aug', sourcedata)), 0), 2)

,m9=RIGHT(100+NULLIF(9*SIGN(CHARINDEX('Sep', sourcedata)), 0), 2)

,m10=RIGHT(100+NULLIF(10*SIGN(CHARINDEX('Oct', sourcedata)), 0), 2)

,m11=RIGHT(100+NULLIF(11*SIGN(CHARINDEX('Nov', sourcedata)), 0), 2)

,m12=RIGHT(100+NULLIF(12*SIGN(CHARINDEX('Dec', sourcedata)), 0), 2)

) e;

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Jeff Moden SSC Guru Points: 1003896 More actions · Answer 3

dwain.c (9/29/2013)
Ok. This should work.
But someone I'm sure will probably come up with a better way.
WITH SampleData (sourcedata) AS
(
SELECT '12 Jan 2013 Test.docx'
UNION ALL SELECT '01 February 2001 File.pptx'
UNION ALL SELECT 'Meeting 04 Feb 2012.xls'
UNION ALL SELECT '09 September 2011.txt'
UNION ALL SELECT '30 Jan 13.doc'
)
SELECT sourcedata, sourcedata2=STUFF(sourcedata, pos1, (pos2+4)-pos1
,yy + COALESCE(m1, m2, m3, m4, m5, m6, m7, m8, m9, m10, m11, m12) + dd)
FROM SampleData a
CROSS APPLY
(
SELECT pos1=PATINDEX('%[0-9][0-9]%', sourcedata)
,x=CASE PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata) WHEN 0 THEN 2 ELSE 4 END
) b
CROSS APPLY
(
SELECT pos2=CASE x
WHEN 4 THEN PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata)
ELSE PATINDEX('%[0-9][0-9]%', RIGHT(sourcedata, LEN(sourcedata)-(pos1+1))) END
) c
CROSS APPLY
(
SELECT dd=SUBSTRING(sourcedata, pos1, 2)
,yy=CASE x WHEN 4 THEN SUBSTRING(sourcedata, pos2, x)
ELSE '20' + SUBSTRING(sourcedata, 2+pos2, x)
END
) d
CROSS APPLY
(
SELECT m1=RIGHT(100+NULLIF(1*SIGN(CHARINDEX('Jan', sourcedata)), 0), 2)
,m2=RIGHT(100+NULLIF(2*SIGN(CHARINDEX('Feb', sourcedata)), 0), 2)
,m3=RIGHT(100+NULLIF(3*SIGN(CHARINDEX('Mar', sourcedata)), 0), 2)
,m4=RIGHT(100+NULLIF(4*SIGN(CHARINDEX('Apr', sourcedata)), 0), 2)
,m5=RIGHT(100+NULLIF(5*SIGN(CHARINDEX('May', sourcedata)), 0), 2)
,m6=RIGHT(100+NULLIF(6*SIGN(CHARINDEX('Jun', sourcedata)), 0), 2)
,m7=RIGHT(100+NULLIF(7*SIGN(CHARINDEX('Jul', sourcedata)), 0), 2)
,m8=RIGHT(100+NULLIF(8*SIGN(CHARINDEX('Aug', sourcedata)), 0), 2)
,m9=RIGHT(100+NULLIF(9*SIGN(CHARINDEX('Sep', sourcedata)), 0), 2)
,m10=RIGHT(100+NULLIF(10*SIGN(CHARINDEX('Oct', sourcedata)), 0), 2)
,m11=RIGHT(100+NULLIF(11*SIGN(CHARINDEX('Nov', sourcedata)), 0), 2)
,m12=RIGHT(100+NULLIF(12*SIGN(CHARINDEX('Dec', sourcedata)), 0), 2)
) e;

It would appear that you have a problem with the MM part of that forumula...

sourcedata sourcedata2

-------------------------- --------------------

12 Jan 2013 Test.docx 20130412 Test.docx

01 February 2001 File.pptx 20010401 File.pptx

Meeting 04 Feb 2012.xls Meeting 20121204.xls

09 September 2011.txt 20110409.txt

30 Jan 13.doc 20130430.doc

The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it.

WITH SampleData (sourcedata) AS

(

SELECT '12 Jan 2013 Test.docx'

UNION ALL SELECT '01 February 2001 File.pptx'

UNION ALL SELECT 'Meeting 04 Feb 2012.xls'

UNION ALL SELECT '09 September 2011.txt'

UNION ALL SELECT '30 Jan 13.doc'

),

cteEnumerate AS

(

SELECT RowNum = ROW_NUMBER()OVER(ORDER BY (SELECT 1)), SourceData

FROM SampleData

),

ctePosition AS

(

SELECT RowNum,SourceData,LeftPos = MIN(t.N), DateLength = MAX(t.N)-MIN(t.N)+2

FROM dbo.Tally t

CROSS JOIN cteEnumerate

WHERE t.N <= LEN(SourceData)

AND SUBSTRING(SourceData,t.N,3) LIKE '[0-9][0-9][^0-9]'

GROUP BY RowNum,SourceData

)

SELECT SourceData,

NewValue = STUFF(SourceData,LeftPos,DateLength,CONVERT(CHAR(8),CAST(SUBSTRING(SourceData,LeftPos,DateLength)AS DATETIME),112))

FROM ctePosition

;

Results:

SourceData NewValue

-------------------------- --------------------

12 Jan 2013 Test.docx 20130112 Test.docx

01 February 2001 File.pptx 20010201 File.pptx

Meeting 04 Feb 2012.xls Meeting 20120204.xls

09 September 2011.txt 20110909.txt

30 Jan 13.doc 20130130.doc

Learn about what a Tally Table is and how it replaces certain loops here...

http://www.sqlservercentral.com/articles/T-SQL/62867/

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Dwain Camps SSC Guru Points: 86908 More actions · Answer 4

Running the code you quoted Jeff, I get these results.

sourcedata sourcedata2

12 Jan 2013 Test.docx 20130112 Test.docx

01 February 2001 File.pptx 20010201 File.pptx

Meeting 04 Feb 2012.xlsMeeting 20120204.xls

09 September 2011.txt 20110909.txt

30 Jan 13.doc 20130130.doc

There were some issues in my earlier attempt. Perhaps you were testing one of those.

Edit: Noting that I'm not particularly proud of my brute force attempt. Certainly could be improved.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Jeff Moden SSC Guru Points: 1003896 More actions · Answer 5

dwain.c (9/29/2013)
Running the code you quoted Jeff, I get these results.
sourcedata sourcedata2
12 Jan 2013 Test.docx 20130112 Test.docx
01 February 2001 File.pptx 20010201 File.pptx
Meeting 04 Feb 2012.xlsMeeting 20120204.xls
09 September 2011.txt 20110909.txt
30 Jan 13.doc 20130130.doc
There were some issues in my earlier attempt. Perhaps you were testing one of those.
Edit: Noting that I'm not particularly proud of my brute force attempt. Certainly could be improved.

Agreed. I just reran that code and it came up with the correct answer. We must have been working simultaneously... the last thing I did was refresh the window and quote your post. As you said, I must have copied the code from the earlier post.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

dc.pc100 Mr or Mrs. 500 Points: 571 More actions · Answer 6

The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it.

Awesome, thanks for the in-depth assistance, works well. Though what will happen if there are other numeric values in the string such as the below?

Sorry, I should have used more meaningful example test data.

3.0 Meeting Minutes 16 January 2013.pdf

0.0 ICT Meeting (12 Jan 2013).doc

ICTAgenda 15January 2013 V1.pptx

ICTAgenda 15 January 2013 V2.pptx

Jeff Moden SSC Guru Points: 1003896 More actions · Answer 7

dc.pc100 (9/29/2013)
The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it.
Awesome, thanks for the in-depth assistance, works well. Though what will happen if there are other numeric values in the string such as the below?
3.0 Meeting Minutes 16 January 2013.pdf
0.0 ICT Meeting (12 Jan 2013).doc

Mine can actually handle that nuance. The difficulty will arise if you include something else that has two consecutive digits followed by a non-digit. Here's the output from my run...

SourceData NewValue

--------------------------------------- --------------------------------

12 Jan 2013 Test.docx 20130112 Test.docx

01 February 2001 File.pptx 20010201 File.pptx

Meeting 04 Feb 2012.xls Meeting 20120204.xls

09 September 2011.txt 20110909.txt

30 Jan 13.doc 20130130.doc

3.0 Meeting Minutes 16 January 2013.pdf 3.0 Meeting Minutes 20130116.pdf

0.0 ICT Meeting (12 Jan 2013).doc 0.0 ICT Meeting (20130112).doc

Dwain's code will handle the extra data you added, as well.

Before you can ask it, mine will also handle dates that are in the form of MM/DD/YYYY.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

dc.pc100 Mr or Mrs. 500 Points: 571 More actions · Answer 8

Awesome guys thank you so much. Beats manually going through 200,000 rows in multiple excel files to change the date 🙂

Jeff Moden SSC Guru Points: 1003896 More actions · Answer 9

dc.pc100 (9/29/2013)
Awesome guys thank you so much. Beats manually going through 200,000 rows in multiple excel files to change the date 🙂

Like everrything else, "It Depends". If I was allowed to drink beer while working the Excel spreadsheet, I'd being inclined to prolong such agony. 😛

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Dwain Camps SSC Guru Points: 86908 More actions · Answer 10

I find it interesting because I'm into self-flagellation and because it seems that all of the date formats you've included in your filenames CAST cleanly to a DATE data type, which can then be formatted to be stuffed back into your filename.

Here's an example using a pattern-based string splitter.

WITH SampleData (sourcedata) AS

(

SELECT '12 Jan 2013 Test.docx'

UNION ALL SELECT '01 February 2001 File.pptx'

UNION ALL SELECT 'Meeting 04 Feb 2012.xls'

UNION ALL SELECT '09 September 2011.txt'

UNION ALL SELECT '30 Jan 13.doc'

UNION ALL SELECT '3.0 Meeting Minutes 16 January 2013.pdf'

UNION ALL SELECT '0.0 ICT Meeting (12 Jan 2013).doc'

UNION ALL SELECT 'ICTAgenda 15January 2013 V1.pptx'

UNION ALL SELECT 'ICTAgenda 15 January 2013 V2.pptx'

),

SplitFNs AS

(

SELECT sourcedata, b.*

,rn=CASE WHEN LEFT(LTRIM(Item), 3) IN

('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')

THEN ItemNumber END

FROM SampleData a

CROSS APPLY dbo.PatternSplitCM(sourcedata, '[0-9]') b

),

DateRecs AS

(

SELECT a.sourcedata, b.ItemNumber, b.Item

FROM SplitFNs a

JOIN SplitFNs b

ON b.ItemNumber BETWEEN a.rn - 1 AND a.rn + 1 AND a.sourcedata = b.sourcedata

)

SELECT sourcedata

,sourcedata2=STUFF(sourcedata, CHARINDEX(FileDate, sourcedata), LEN(FileDate),CAST(FileDate AS DATE))

FROM SampleData a

CROSS APPLY

(

SELECT Item + ''

FROM DateRecs b

WHERE a.sourcedata = b.sourcedata

ORDER BY ItemNumber

FOR XML PATH('')

) b(FileDate);

PatternSplitCM can be found in my 4th signature link.

Note that I'm not necessarily recommending this approach. I just thought that odd ability to CAST to DATE might be easily exploitable, but alas I didn't find it so easy.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

dc.pc100 Mr or Mrs. 500 Points: 571 More actions · Answer 11

dc.pc100

Mr or Mrs. 500

Points: 571

September 29, 2013 at 11:59 pm

#1654416

Delete

dc.pc100 Mr or Mrs. 500 Points: 571 More actions · Answer 12

How or where would i add the below

Where [SourceData] like '%[0-9]%' + 'Jan' + '[0-9]%'

as lots of data gets messed up with dwains code, and want to limit it to date specific strings only

e.g

Destination File Name....sourcedata2

20130616.xlsx..............0616.xlsx

201307.xlsx.................07.xlsx

HSE.201306.xlsx...........HSE.06.xlsx

HSE.201307.xlsx...........HSE.07.xlsx

2013 ICT Calendar.xls....ICT Calendar.xls

Inv.10017355.txt..........Optus.Inv.7355.txt

Dwain Camps SSC Guru Points: 86908 More actions · Answer 13

dc.pc100 (9/30/2013)
How or where would i add the below
Where [SourceData] like '%[0-9]%' + 'Jan' + '[0-9]%'
as lots of data gets messed up with dwains code
e.g
Destination File Name....sourcedata2
20130616.xlsx..............0616.xlsx
201307.xlsx.................07.xlsx
HSE.201306.xlsx...........HSE.06.xlsx
HSE.201307.xlsx...........HSE.07.xlsx
2013 ICT Calendar.xls....ICT Calendar.xls
Inv.10017355.txt..........Optus.Inv.7355.txt

Yup mine won't handle those.

Have you tried Jeff's code with this case?

I think using LIKE like you are trying to do is going to be a pain because you'd need 12 of them OR'd together.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

dc.pc100 Mr or Mrs. 500 Points: 571 More actions · Answer 14

I havent had much experience with Tally tables so do no know how to utilize them.

The "OR" x12 in the where clause it fine for now. Ill probs just create another table and query the words(month names) in there

Not best practice but yea

Just having troubles inserting the where clause, as it doesn't like it anywhere near there Cross Apply