## Convert DD MMM YYYY to yyyymmdd in string

 Author Message dc.pc100 SSC Veteran Group: General Forum Members Points: 287 Visits: 9 Hi,I am struggling with changing the date format of dd MMM YYYY in the sample strings below into yyymmddSample Data InputColumn Name: SourceName12 Jan 2013 Test.docx 01 February 2001 File.pptx Meeting 04 Feb 2012.xls 09 September 2011.txt 30 Jan 13.docSample Data OutputColumn Name: SourceName20130112 Test.docx 20010201 File.pptx Meeting 20120204.xls 20110911.txt 20130130.doc dwain.c SSC-Forever Group: General Forum Members Points: 44297 Visits: 6431 Post retracted. Hang on for a replacement. My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!My thought question: Have you ever been told that your query runs too fast?My advice:INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.Need to UNPIVOT? Why not CROSS APPLY VALUES instead?Since random numbers are too important to be left to chance, let's generate some!Learn to understand recursive CTEs by example.Splitting strings based on patterns can be fast!My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables dwain.c SSC-Forever Group: General Forum Members Points: 44297 Visits: 6431 Ok. This should work.But someone I'm sure will probably come up with a better way.`WITH SampleData (sourcedata) AS( SELECT '12 Jan 2013 Test.docx' UNION ALL SELECT '01 February 2001 File.pptx' UNION ALL SELECT 'Meeting 04 Feb 2012.xls' UNION ALL SELECT '09 September 2011.txt' UNION ALL SELECT '30 Jan 13.doc')SELECT sourcedata, sourcedata2=STUFF(sourcedata, pos1, (pos2+4)-pos1 ,yy + COALESCE(m1, m2, m3, m4, m5, m6, m7, m8, m9, m10, m11, m12) + dd)FROM SampleData aCROSS APPLY( SELECT pos1=PATINDEX('%[0-9][0-9]%', sourcedata) ,x=CASE PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata) WHEN 0 THEN 2 ELSE 4 END) bCROSS APPLY( SELECT pos2=CASE x WHEN 4 THEN PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata) ELSE PATINDEX('%[0-9][0-9]%', RIGHT(sourcedata, LEN(sourcedata)-(pos1+1))) END) cCROSS APPLY( SELECT dd=SUBSTRING(sourcedata, pos1, 2) ,yy=CASE x WHEN 4 THEN SUBSTRING(sourcedata, pos2, x) ELSE '20' + SUBSTRING(sourcedata, 2+pos2, x) END ) dCROSS APPLY( SELECT m1=RIGHT(100+NULLIF(1*SIGN(CHARINDEX('Jan', sourcedata)), 0), 2) ,m2=RIGHT(100+NULLIF(2*SIGN(CHARINDEX('Feb', sourcedata)), 0), 2) ,m3=RIGHT(100+NULLIF(3*SIGN(CHARINDEX('Mar', sourcedata)), 0), 2) ,m4=RIGHT(100+NULLIF(4*SIGN(CHARINDEX('Apr', sourcedata)), 0), 2) ,m5=RIGHT(100+NULLIF(5*SIGN(CHARINDEX('May', sourcedata)), 0), 2) ,m6=RIGHT(100+NULLIF(6*SIGN(CHARINDEX('Jun', sourcedata)), 0), 2) ,m7=RIGHT(100+NULLIF(7*SIGN(CHARINDEX('Jul', sourcedata)), 0), 2) ,m8=RIGHT(100+NULLIF(8*SIGN(CHARINDEX('Aug', sourcedata)), 0), 2) ,m9=RIGHT(100+NULLIF(9*SIGN(CHARINDEX('Sep', sourcedata)), 0), 2) ,m10=RIGHT(100+NULLIF(10*SIGN(CHARINDEX('Oct', sourcedata)), 0), 2) ,m11=RIGHT(100+NULLIF(11*SIGN(CHARINDEX('Nov', sourcedata)), 0), 2) ,m12=RIGHT(100+NULLIF(12*SIGN(CHARINDEX('Dec', sourcedata)), 0), 2)) e;` My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!My thought question: Have you ever been told that your query runs too fast?My advice:INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.Need to UNPIVOT? Why not CROSS APPLY VALUES instead?Since random numbers are too important to be left to chance, let's generate some!Learn to understand recursive CTEs by example.Splitting strings based on patterns can be fast!My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables Jeff Moden SSC Guru Group: General Forum Members Points: 509097 Visits: 44288 dwain.c (9/29/2013)Ok. This should work.But someone I'm sure will probably come up with a better way.`WITH SampleData (sourcedata) AS( SELECT '12 Jan 2013 Test.docx' UNION ALL SELECT '01 February 2001 File.pptx' UNION ALL SELECT 'Meeting 04 Feb 2012.xls' UNION ALL SELECT '09 September 2011.txt' UNION ALL SELECT '30 Jan 13.doc')SELECT sourcedata, sourcedata2=STUFF(sourcedata, pos1, (pos2+4)-pos1 ,yy + COALESCE(m1, m2, m3, m4, m5, m6, m7, m8, m9, m10, m11, m12) + dd)FROM SampleData aCROSS APPLY( SELECT pos1=PATINDEX('%[0-9][0-9]%', sourcedata) ,x=CASE PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata) WHEN 0 THEN 2 ELSE 4 END) bCROSS APPLY( SELECT pos2=CASE x WHEN 4 THEN PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata) ELSE PATINDEX('%[0-9][0-9]%', RIGHT(sourcedata, LEN(sourcedata)-(pos1+1))) END) cCROSS APPLY( SELECT dd=SUBSTRING(sourcedata, pos1, 2) ,yy=CASE x WHEN 4 THEN SUBSTRING(sourcedata, pos2, x) ELSE '20' + SUBSTRING(sourcedata, 2+pos2, x) END ) dCROSS APPLY( SELECT m1=RIGHT(100+NULLIF(1*SIGN(CHARINDEX('Jan', sourcedata)), 0), 2) ,m2=RIGHT(100+NULLIF(2*SIGN(CHARINDEX('Feb', sourcedata)), 0), 2) ,m3=RIGHT(100+NULLIF(3*SIGN(CHARINDEX('Mar', sourcedata)), 0), 2) ,m4=RIGHT(100+NULLIF(4*SIGN(CHARINDEX('Apr', sourcedata)), 0), 2) ,m5=RIGHT(100+NULLIF(5*SIGN(CHARINDEX('May', sourcedata)), 0), 2) ,m6=RIGHT(100+NULLIF(6*SIGN(CHARINDEX('Jun', sourcedata)), 0), 2) ,m7=RIGHT(100+NULLIF(7*SIGN(CHARINDEX('Jul', sourcedata)), 0), 2) ,m8=RIGHT(100+NULLIF(8*SIGN(CHARINDEX('Aug', sourcedata)), 0), 2) ,m9=RIGHT(100+NULLIF(9*SIGN(CHARINDEX('Sep', sourcedata)), 0), 2) ,m10=RIGHT(100+NULLIF(10*SIGN(CHARINDEX('Oct', sourcedata)), 0), 2) ,m11=RIGHT(100+NULLIF(11*SIGN(CHARINDEX('Nov', sourcedata)), 0), 2) ,m12=RIGHT(100+NULLIF(12*SIGN(CHARINDEX('Dec', sourcedata)), 0), 2)) e;`It would appear that you have a problem with the MM part of that forumula...`sourcedata sourcedata2-------------------------- --------------------12 Jan 2013 Test.docx 20130412 Test.docx01 February 2001 File.pptx 20010401 File.pptxMeeting 04 Feb 2012.xls Meeting 20121204.xls09 September 2011.txt 20110409.txt30 Jan 13.doc 20130430.doc`The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it. `WITH SampleData (sourcedata) AS( SELECT '12 Jan 2013 Test.docx' UNION ALL SELECT '01 February 2001 File.pptx' UNION ALL SELECT 'Meeting 04 Feb 2012.xls' UNION ALL SELECT '09 September 2011.txt' UNION ALL SELECT '30 Jan 13.doc'),cteEnumerate AS( SELECT RowNum = ROW_NUMBER()OVER(ORDER BY (SELECT 1)), SourceData FROM SampleData),ctePosition AS( SELECT RowNum,SourceData,LeftPos = MIN(t.N), DateLength = MAX(t.N)-MIN(t.N)+2 FROM dbo.Tally t CROSS JOIN cteEnumerate WHERE t.N <= LEN(SourceData) AND SUBSTRING(SourceData,t.N,3) LIKE '[0-9][0-9][^0-9]' GROUP BY RowNum,SourceData) SELECT SourceData, NewValue = STUFF(SourceData,LeftPos,DateLength,CONVERT(CHAR(8),CAST(SUBSTRING(SourceData,LeftPos,DateLength)AS DATETIME),112)) FROM ctePosition;`Results:`SourceData NewValue-------------------------- --------------------12 Jan 2013 Test.docx 20130112 Test.docx01 February 2001 File.pptx 20010201 File.pptxMeeting 04 Feb 2012.xls Meeting 20120204.xls09 September 2011.txt 20110909.txt30 Jan 13.doc 20130130.doc`Learn about what a Tally Table is and how it replaces certain loops here...http://www.sqlservercentral.com/articles/T-SQL/62867/ --Jeff ModenRBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair Helpful Links:How to post code problemsHow to post performance problemsForum FAQs dwain.c SSC-Forever Group: General Forum Members Points: 44297 Visits: 6431 Running the code you quoted Jeff, I get these results.`sourcedata sourcedata212 Jan 2013 Test.docx 20130112 Test.docx01 February 2001 File.pptx 20010201 File.pptxMeeting 04 Feb 2012.xls Meeting 20120204.xls09 September 2011.txt 20110909.txt30 Jan 13.doc 20130130.doc`There were some issues in my earlier attempt. Perhaps you were testing one of those.Edit: Noting that I'm not particularly proud of my brute force attempt. Certainly could be improved. My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!My thought question: Have you ever been told that your query runs too fast?My advice:INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.Need to UNPIVOT? Why not CROSS APPLY VALUES instead?Since random numbers are too important to be left to chance, let's generate some!Learn to understand recursive CTEs by example.Splitting strings based on patterns can be fast!My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables Jeff Moden SSC Guru Group: General Forum Members Points: 509097 Visits: 44288 dwain.c (9/29/2013)Running the code you quoted Jeff, I get these results.`sourcedata sourcedata212 Jan 2013 Test.docx 20130112 Test.docx01 February 2001 File.pptx 20010201 File.pptxMeeting 04 Feb 2012.xls Meeting 20120204.xls09 September 2011.txt 20110909.txt30 Jan 13.doc 20130130.doc`There were some issues in my earlier attempt. Perhaps you were testing one of those.Edit: Noting that I'm not particularly proud of my brute force attempt. Certainly could be improved.Agreed. I just reran that code and it came up with the correct answer. We must have been working simultaneously... the last thing I did was refresh the window and quote your post. As you said, I must have copied the code from the earlier post. --Jeff ModenRBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair Helpful Links:How to post code problemsHow to post performance problemsForum FAQs dc.pc100 SSC Veteran Group: General Forum Members Points: 287 Visits: 9 The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it. Awesome, thanks for the in-depth assistance, works well. Though what will happen if there are other numeric values in the string such as the below?Sorry, I should have used more meaningful example test data.3.0 Meeting Minutes 16 January 2013.pdf0.0 ICT Meeting (12 Jan 2013).docICTAgenda 15January 2013 V1.pptxICTAgenda 15 January 2013 V2.pptx Jeff Moden SSC Guru Group: General Forum Members Points: 509097 Visits: 44288 dc.pc100 (9/29/2013)The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it. Awesome, thanks for the in-depth assistance, works well. Though what will happen if there are other numeric values in the string such as the below?3.0 Meeting Minutes 16 January 2013.pdf0.0 ICT Meeting (12 Jan 2013).docMine can actually handle that nuance. The difficulty will arise if you include something else that has two consecutive digits followed by a non-digit. Here's the output from my run...`SourceData NewValue--------------------------------------- --------------------------------12 Jan 2013 Test.docx 20130112 Test.docx01 February 2001 File.pptx 20010201 File.pptxMeeting 04 Feb 2012.xls Meeting 20120204.xls09 September 2011.txt 20110909.txt30 Jan 13.doc 20130130.doc3.0 Meeting Minutes 16 January 2013.pdf 3.0 Meeting Minutes 20130116.pdf0.0 ICT Meeting (12 Jan 2013).doc 0.0 ICT Meeting (20130112).doc`Dwain's code will handle the extra data you added, as well.Before you can ask it, mine will also handle dates that are in the form of MM/DD/YYYY. --Jeff ModenRBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair Helpful Links:How to post code problemsHow to post performance problemsForum FAQs dc.pc100 SSC Veteran Group: General Forum Members Points: 287 Visits: 9 Awesome guys thank you so much. Beats manually going through 200,000 rows in multiple excel files to change the date Jeff Moden SSC Guru Group: General Forum Members Points: 509097 Visits: 44288 dc.pc100 (9/29/2013)Awesome guys thank you so much. Beats manually going through 200,000 rows in multiple excel files to change the date Like everrything else, "It Depends". If I was allowed to drink beer while working the Excel spreadsheet, I'd being inclined to prolong such agony. :-P --Jeff ModenRBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair Helpful Links:How to post code problemsHow to post performance problemsForum FAQs