Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Convert DD MMM YYYY to yyyymmdd in string Expand / Collapse
Author
Message
Posted Sunday, September 29, 2013 6:21 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, September 30, 2013 4:18 PM
Points: 7, Visits: 9
Hi,

I am struggling with changing the date format of dd MMM YYYY in the sample strings below into yyymmdd

Sample Data Input
Column Name:
SourceName
12 Jan 2013 Test.docx
01 February 2001 File.pptx
Meeting 04 Feb 2012.xls
09 September 2011.txt
30 Jan 13.doc


Sample Data Output
Column Name:
SourceName
20130112 Test.docx
20010201 File.pptx
Meeting 20120204.xls
20110911.txt
20130130.doc
Post #1499806
Posted Sunday, September 29, 2013 7:10 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 3:30 AM
Points: 3,420, Visits: 5,348
Post retracted. Hang on for a replacement.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1499815
Posted Sunday, September 29, 2013 7:16 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 3:30 AM
Points: 3,420, Visits: 5,348
Ok. This should work.

But someone I'm sure will probably come up with a better way.

WITH SampleData (sourcedata) AS
(
SELECT '12 Jan 2013 Test.docx'
UNION ALL SELECT '01 February 2001 File.pptx'
UNION ALL SELECT 'Meeting 04 Feb 2012.xls'
UNION ALL SELECT '09 September 2011.txt'
UNION ALL SELECT '30 Jan 13.doc'
)
SELECT sourcedata, sourcedata2=STUFF(sourcedata, pos1, (pos2+4)-pos1
,yy + COALESCE(m1, m2, m3, m4, m5, m6, m7, m8, m9, m10, m11, m12) + dd)
FROM SampleData a
CROSS APPLY
(
SELECT pos1=PATINDEX('%[0-9][0-9]%', sourcedata)
,x=CASE PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata) WHEN 0 THEN 2 ELSE 4 END
) b
CROSS APPLY
(
SELECT pos2=CASE x
WHEN 4 THEN PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata)
ELSE PATINDEX('%[0-9][0-9]%', RIGHT(sourcedata, LEN(sourcedata)-(pos1+1))) END
) c
CROSS APPLY
(
SELECT dd=SUBSTRING(sourcedata, pos1, 2)
,yy=CASE x WHEN 4 THEN SUBSTRING(sourcedata, pos2, x)
ELSE '20' + SUBSTRING(sourcedata, 2+pos2, x)
END
) d
CROSS APPLY
(
SELECT m1=RIGHT(100+NULLIF(1*SIGN(CHARINDEX('Jan', sourcedata)), 0), 2)
,m2=RIGHT(100+NULLIF(2*SIGN(CHARINDEX('Feb', sourcedata)), 0), 2)
,m3=RIGHT(100+NULLIF(3*SIGN(CHARINDEX('Mar', sourcedata)), 0), 2)
,m4=RIGHT(100+NULLIF(4*SIGN(CHARINDEX('Apr', sourcedata)), 0), 2)
,m5=RIGHT(100+NULLIF(5*SIGN(CHARINDEX('May', sourcedata)), 0), 2)
,m6=RIGHT(100+NULLIF(6*SIGN(CHARINDEX('Jun', sourcedata)), 0), 2)
,m7=RIGHT(100+NULLIF(7*SIGN(CHARINDEX('Jul', sourcedata)), 0), 2)
,m8=RIGHT(100+NULLIF(8*SIGN(CHARINDEX('Aug', sourcedata)), 0), 2)
,m9=RIGHT(100+NULLIF(9*SIGN(CHARINDEX('Sep', sourcedata)), 0), 2)
,m10=RIGHT(100+NULLIF(10*SIGN(CHARINDEX('Oct', sourcedata)), 0), 2)
,m11=RIGHT(100+NULLIF(11*SIGN(CHARINDEX('Nov', sourcedata)), 0), 2)
,m12=RIGHT(100+NULLIF(12*SIGN(CHARINDEX('Dec', sourcedata)), 0), 2)
) e;





My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1499817
Posted Sunday, September 29, 2013 8:14 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 7:02 AM
Points: 35,263, Visits: 31,752
dwain.c (9/29/2013)
Ok. This should work.

But someone I'm sure will probably come up with a better way.

WITH SampleData (sourcedata) AS
(
SELECT '12 Jan 2013 Test.docx'
UNION ALL SELECT '01 February 2001 File.pptx'
UNION ALL SELECT 'Meeting 04 Feb 2012.xls'
UNION ALL SELECT '09 September 2011.txt'
UNION ALL SELECT '30 Jan 13.doc'
)
SELECT sourcedata, sourcedata2=STUFF(sourcedata, pos1, (pos2+4)-pos1
,yy + COALESCE(m1, m2, m3, m4, m5, m6, m7, m8, m9, m10, m11, m12) + dd)
FROM SampleData a
CROSS APPLY
(
SELECT pos1=PATINDEX('%[0-9][0-9]%', sourcedata)
,x=CASE PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata) WHEN 0 THEN 2 ELSE 4 END
) b
CROSS APPLY
(
SELECT pos2=CASE x
WHEN 4 THEN PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata)
ELSE PATINDEX('%[0-9][0-9]%', RIGHT(sourcedata, LEN(sourcedata)-(pos1+1))) END
) c
CROSS APPLY
(
SELECT dd=SUBSTRING(sourcedata, pos1, 2)
,yy=CASE x WHEN 4 THEN SUBSTRING(sourcedata, pos2, x)
ELSE '20' + SUBSTRING(sourcedata, 2+pos2, x)
END
) d
CROSS APPLY
(
SELECT m1=RIGHT(100+NULLIF(1*SIGN(CHARINDEX('Jan', sourcedata)), 0), 2)
,m2=RIGHT(100+NULLIF(2*SIGN(CHARINDEX('Feb', sourcedata)), 0), 2)
,m3=RIGHT(100+NULLIF(3*SIGN(CHARINDEX('Mar', sourcedata)), 0), 2)
,m4=RIGHT(100+NULLIF(4*SIGN(CHARINDEX('Apr', sourcedata)), 0), 2)
,m5=RIGHT(100+NULLIF(5*SIGN(CHARINDEX('May', sourcedata)), 0), 2)
,m6=RIGHT(100+NULLIF(6*SIGN(CHARINDEX('Jun', sourcedata)), 0), 2)
,m7=RIGHT(100+NULLIF(7*SIGN(CHARINDEX('Jul', sourcedata)), 0), 2)
,m8=RIGHT(100+NULLIF(8*SIGN(CHARINDEX('Aug', sourcedata)), 0), 2)
,m9=RIGHT(100+NULLIF(9*SIGN(CHARINDEX('Sep', sourcedata)), 0), 2)
,m10=RIGHT(100+NULLIF(10*SIGN(CHARINDEX('Oct', sourcedata)), 0), 2)
,m11=RIGHT(100+NULLIF(11*SIGN(CHARINDEX('Nov', sourcedata)), 0), 2)
,m12=RIGHT(100+NULLIF(12*SIGN(CHARINDEX('Dec', sourcedata)), 0), 2)
) e;




It would appear that you have a problem with the MM part of that forumula...

sourcedata                 sourcedata2
-------------------------- --------------------
12 Jan 2013 Test.docx 20130412 Test.docx
01 February 2001 File.pptx 20010401 File.pptx
Meeting 04 Feb 2012.xls Meeting 20121204.xls
09 September 2011.txt 20110409.txt
30 Jan 13.doc 20130430.doc



The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it.

WITH SampleData (sourcedata) AS
(
SELECT '12 Jan 2013 Test.docx'
UNION ALL SELECT '01 February 2001 File.pptx'
UNION ALL SELECT 'Meeting 04 Feb 2012.xls'
UNION ALL SELECT '09 September 2011.txt'
UNION ALL SELECT '30 Jan 13.doc'
),
cteEnumerate AS
(
SELECT RowNum = ROW_NUMBER()OVER(ORDER BY (SELECT 1)), SourceData
FROM SampleData
),
ctePosition AS
(
SELECT RowNum,SourceData,LeftPos = MIN(t.N), DateLength = MAX(t.N)-MIN(t.N)+2
FROM dbo.Tally t
CROSS JOIN cteEnumerate
WHERE t.N <= LEN(SourceData)
AND SUBSTRING(SourceData,t.N,3) LIKE '[0-9][0-9][^0-9]'
GROUP BY RowNum,SourceData
)
SELECT SourceData,
NewValue = STUFF(SourceData,LeftPos,DateLength,CONVERT(CHAR(8),CAST(SUBSTRING(SourceData,LeftPos,DateLength)AS DATETIME),112))
FROM ctePosition
;


Results:
SourceData                 NewValue
-------------------------- --------------------
12 Jan 2013 Test.docx 20130112 Test.docx
01 February 2001 File.pptx 20010201 File.pptx
Meeting 04 Feb 2012.xls Meeting 20120204.xls
09 September 2011.txt 20110909.txt
30 Jan 13.doc 20130130.doc



Learn about what a Tally Table is and how it replaces certain loops here...
http://www.sqlservercentral.com/articles/T-SQL/62867/


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1499821
Posted Sunday, September 29, 2013 8:28 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 3:30 AM
Points: 3,420, Visits: 5,348
Running the code you quoted Jeff, I get these results.

sourcedata                       sourcedata2
12 Jan 2013 Test.docx 20130112 Test.docx
01 February 2001 File.pptx 20010201 File.pptx
Meeting 04 Feb 2012.xls Meeting 20120204.xls
09 September 2011.txt 20110909.txt
30 Jan 13.doc 20130130.doc



There were some issues in my earlier attempt. Perhaps you were testing one of those.

Edit: Noting that I'm not particularly proud of my brute force attempt. Certainly could be improved.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1499823
Posted Sunday, September 29, 2013 8:41 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 7:02 AM
Points: 35,263, Visits: 31,752
dwain.c (9/29/2013)
Running the code you quoted Jeff, I get these results.

sourcedata                       sourcedata2
12 Jan 2013 Test.docx 20130112 Test.docx
01 February 2001 File.pptx 20010201 File.pptx
Meeting 04 Feb 2012.xls Meeting 20120204.xls
09 September 2011.txt 20110909.txt
30 Jan 13.doc 20130130.doc



There were some issues in my earlier attempt. Perhaps you were testing one of those.

Edit: Noting that I'm not particularly proud of my brute force attempt. Certainly could be improved.


Agreed. I just reran that code and it came up with the correct answer. We must have been working simultaneously... the last thing I did was refresh the window and quote your post. As you said, I must have copied the code from the earlier post.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1499824
Posted Sunday, September 29, 2013 9:12 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, September 30, 2013 4:18 PM
Points: 7, Visits: 9

The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it.


Awesome, thanks for the in-depth assistance, works well. Though what will happen if there are other numeric values in the string such as the below?

Sorry, I should have used more meaningful example test data.

3.0 Meeting Minutes 16 January 2013.pdf
0.0 ICT Meeting (12 Jan 2013).doc
ICTAgenda 15January 2013 V1.pptx
ICTAgenda 15 January 2013 V2.pptx
Post #1499825
Posted Sunday, September 29, 2013 9:19 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 7:02 AM
Points: 35,263, Visits: 31,752
dc.pc100 (9/29/2013)

The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it.


Awesome, thanks for the in-depth assistance, works well. Though what will happen if there are other numeric values in the string such as the below?

3.0 Meeting Minutes 16 January 2013.pdf
0.0 ICT Meeting (12 Jan 2013).doc



Mine can actually handle that nuance. The difficulty will arise if you include something else that has two consecutive digits followed by a non-digit. Here's the output from my run...
SourceData                              NewValue
--------------------------------------- --------------------------------
12 Jan 2013 Test.docx 20130112 Test.docx
01 February 2001 File.pptx 20010201 File.pptx
Meeting 04 Feb 2012.xls Meeting 20120204.xls
09 September 2011.txt 20110909.txt
30 Jan 13.doc 20130130.doc
3.0 Meeting Minutes 16 January 2013.pdf 3.0 Meeting Minutes 20130116.pdf
0.0 ICT Meeting (12 Jan 2013).doc 0.0 ICT Meeting (20130112).doc



Dwain's code will handle the extra data you added, as well.

Before you can ask it, mine will also handle dates that are in the form of MM/DD/YYYY.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1499826
Posted Sunday, September 29, 2013 9:44 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, September 30, 2013 4:18 PM
Points: 7, Visits: 9
Awesome guys thank you so much. Beats manually going through 200,000 rows in multiple excel files to change the date :)
Post #1499827
Posted Sunday, September 29, 2013 9:53 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 7:02 AM
Points: 35,263, Visits: 31,752
dc.pc100 (9/29/2013)
Awesome guys thank you so much. Beats manually going through 200,000 rows in multiple excel files to change the date :)


Like everrything else, "It Depends". If I was allowed to drink beer while working the Excel spreadsheet, I'd being inclined to prolong such agony.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1499828
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse