Convert DD MMM YYYY to yyyymmdd in string

  • Hi,

    I am struggling with changing the date format of dd MMM YYYY in the sample strings below into yyymmdd

    Sample Data Input

    Column Name:

    SourceName

    12 Jan 2013 Test.docx

    01 February 2001 File.pptx

    Meeting 04 Feb 2012.xls

    09 September 2011.txt

    30 Jan 13.doc

    Sample Data Output

    Column Name:

    SourceName

    20130112 Test.docx

    20010201 File.pptx

    Meeting 20120204.xls

    20110911.txt

    20130130.doc

  • Post retracted. Hang on for a replacement.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Ok. This should work.

    But someone I'm sure will probably come up with a better way.

    WITH SampleData (sourcedata) AS

    (

    SELECT '12 Jan 2013 Test.docx'

    UNION ALL SELECT '01 February 2001 File.pptx'

    UNION ALL SELECT 'Meeting 04 Feb 2012.xls'

    UNION ALL SELECT '09 September 2011.txt'

    UNION ALL SELECT '30 Jan 13.doc'

    )

    SELECT sourcedata, sourcedata2=STUFF(sourcedata, pos1, (pos2+4)-pos1

    ,yy + COALESCE(m1, m2, m3, m4, m5, m6, m7, m8, m9, m10, m11, m12) + dd)

    FROM SampleData a

    CROSS APPLY

    (

    SELECT pos1=PATINDEX('%[0-9][0-9]%', sourcedata)

    ,x=CASE PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata) WHEN 0 THEN 2 ELSE 4 END

    ) b

    CROSS APPLY

    (

    SELECT pos2=CASE x

    WHEN 4 THEN PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata)

    ELSE PATINDEX('%[0-9][0-9]%', RIGHT(sourcedata, LEN(sourcedata)-(pos1+1))) END

    ) c

    CROSS APPLY

    (

    SELECT dd=SUBSTRING(sourcedata, pos1, 2)

    ,yy=CASE x WHEN 4 THEN SUBSTRING(sourcedata, pos2, x)

    ELSE '20' + SUBSTRING(sourcedata, 2+pos2, x)

    END

    ) d

    CROSS APPLY

    (

    SELECT m1=RIGHT(100+NULLIF(1*SIGN(CHARINDEX('Jan', sourcedata)), 0), 2)

    ,m2=RIGHT(100+NULLIF(2*SIGN(CHARINDEX('Feb', sourcedata)), 0), 2)

    ,m3=RIGHT(100+NULLIF(3*SIGN(CHARINDEX('Mar', sourcedata)), 0), 2)

    ,m4=RIGHT(100+NULLIF(4*SIGN(CHARINDEX('Apr', sourcedata)), 0), 2)

    ,m5=RIGHT(100+NULLIF(5*SIGN(CHARINDEX('May', sourcedata)), 0), 2)

    ,m6=RIGHT(100+NULLIF(6*SIGN(CHARINDEX('Jun', sourcedata)), 0), 2)

    ,m7=RIGHT(100+NULLIF(7*SIGN(CHARINDEX('Jul', sourcedata)), 0), 2)

    ,m8=RIGHT(100+NULLIF(8*SIGN(CHARINDEX('Aug', sourcedata)), 0), 2)

    ,m9=RIGHT(100+NULLIF(9*SIGN(CHARINDEX('Sep', sourcedata)), 0), 2)

    ,m10=RIGHT(100+NULLIF(10*SIGN(CHARINDEX('Oct', sourcedata)), 0), 2)

    ,m11=RIGHT(100+NULLIF(11*SIGN(CHARINDEX('Nov', sourcedata)), 0), 2)

    ,m12=RIGHT(100+NULLIF(12*SIGN(CHARINDEX('Dec', sourcedata)), 0), 2)

    ) e;


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (9/29/2013)


    Ok. This should work.

    But someone I'm sure will probably come up with a better way.

    WITH SampleData (sourcedata) AS

    (

    SELECT '12 Jan 2013 Test.docx'

    UNION ALL SELECT '01 February 2001 File.pptx'

    UNION ALL SELECT 'Meeting 04 Feb 2012.xls'

    UNION ALL SELECT '09 September 2011.txt'

    UNION ALL SELECT '30 Jan 13.doc'

    )

    SELECT sourcedata, sourcedata2=STUFF(sourcedata, pos1, (pos2+4)-pos1

    ,yy + COALESCE(m1, m2, m3, m4, m5, m6, m7, m8, m9, m10, m11, m12) + dd)

    FROM SampleData a

    CROSS APPLY

    (

    SELECT pos1=PATINDEX('%[0-9][0-9]%', sourcedata)

    ,x=CASE PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata) WHEN 0 THEN 2 ELSE 4 END

    ) b

    CROSS APPLY

    (

    SELECT pos2=CASE x

    WHEN 4 THEN PATINDEX('%[0-9][0-9][0-9][0-9]%', sourcedata)

    ELSE PATINDEX('%[0-9][0-9]%', RIGHT(sourcedata, LEN(sourcedata)-(pos1+1))) END

    ) c

    CROSS APPLY

    (

    SELECT dd=SUBSTRING(sourcedata, pos1, 2)

    ,yy=CASE x WHEN 4 THEN SUBSTRING(sourcedata, pos2, x)

    ELSE '20' + SUBSTRING(sourcedata, 2+pos2, x)

    END

    ) d

    CROSS APPLY

    (

    SELECT m1=RIGHT(100+NULLIF(1*SIGN(CHARINDEX('Jan', sourcedata)), 0), 2)

    ,m2=RIGHT(100+NULLIF(2*SIGN(CHARINDEX('Feb', sourcedata)), 0), 2)

    ,m3=RIGHT(100+NULLIF(3*SIGN(CHARINDEX('Mar', sourcedata)), 0), 2)

    ,m4=RIGHT(100+NULLIF(4*SIGN(CHARINDEX('Apr', sourcedata)), 0), 2)

    ,m5=RIGHT(100+NULLIF(5*SIGN(CHARINDEX('May', sourcedata)), 0), 2)

    ,m6=RIGHT(100+NULLIF(6*SIGN(CHARINDEX('Jun', sourcedata)), 0), 2)

    ,m7=RIGHT(100+NULLIF(7*SIGN(CHARINDEX('Jul', sourcedata)), 0), 2)

    ,m8=RIGHT(100+NULLIF(8*SIGN(CHARINDEX('Aug', sourcedata)), 0), 2)

    ,m9=RIGHT(100+NULLIF(9*SIGN(CHARINDEX('Sep', sourcedata)), 0), 2)

    ,m10=RIGHT(100+NULLIF(10*SIGN(CHARINDEX('Oct', sourcedata)), 0), 2)

    ,m11=RIGHT(100+NULLIF(11*SIGN(CHARINDEX('Nov', sourcedata)), 0), 2)

    ,m12=RIGHT(100+NULLIF(12*SIGN(CHARINDEX('Dec', sourcedata)), 0), 2)

    ) e;

    It would appear that you have a problem with the MM part of that forumula...

    sourcedata sourcedata2

    -------------------------- --------------------

    12 Jan 2013 Test.docx 20130412 Test.docx

    01 February 2001 File.pptx 20010401 File.pptx

    Meeting 04 Feb 2012.xls Meeting 20121204.xls

    09 September 2011.txt 20110409.txt

    30 Jan 13.doc 20130430.doc

    The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it.

    WITH SampleData (sourcedata) AS

    (

    SELECT '12 Jan 2013 Test.docx'

    UNION ALL SELECT '01 February 2001 File.pptx'

    UNION ALL SELECT 'Meeting 04 Feb 2012.xls'

    UNION ALL SELECT '09 September 2011.txt'

    UNION ALL SELECT '30 Jan 13.doc'

    ),

    cteEnumerate AS

    (

    SELECT RowNum = ROW_NUMBER()OVER(ORDER BY (SELECT 1)), SourceData

    FROM SampleData

    ),

    ctePosition AS

    (

    SELECT RowNum,SourceData,LeftPos = MIN(t.N), DateLength = MAX(t.N)-MIN(t.N)+2

    FROM dbo.Tally t

    CROSS JOIN cteEnumerate

    WHERE t.N <= LEN(SourceData)

    AND SUBSTRING(SourceData,t.N,3) LIKE '[0-9][0-9][^0-9]'

    GROUP BY RowNum,SourceData

    )

    SELECT SourceData,

    NewValue = STUFF(SourceData,LeftPos,DateLength,CONVERT(CHAR(8),CAST(SUBSTRING(SourceData,LeftPos,DateLength)AS DATETIME),112))

    FROM ctePosition

    ;

    Results:

    SourceData NewValue

    -------------------------- --------------------

    12 Jan 2013 Test.docx 20130112 Test.docx

    01 February 2001 File.pptx 20010201 File.pptx

    Meeting 04 Feb 2012.xls Meeting 20120204.xls

    09 September 2011.txt 20110909.txt

    30 Jan 13.doc 20130130.doc

    Learn about what a Tally Table is and how it replaces certain loops here...

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Running the code you quoted Jeff, I get these results.

    sourcedata sourcedata2

    12 Jan 2013 Test.docx 20130112 Test.docx

    01 February 2001 File.pptx 20010201 File.pptx

    Meeting 04 Feb 2012.xlsMeeting 20120204.xls

    09 September 2011.txt 20110909.txt

    30 Jan 13.doc 20130130.doc

    There were some issues in my earlier attempt. Perhaps you were testing one of those.

    Edit: Noting that I'm not particularly proud of my brute force attempt. Certainly could be improved.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (9/29/2013)


    Running the code you quoted Jeff, I get these results.

    sourcedata sourcedata2

    12 Jan 2013 Test.docx 20130112 Test.docx

    01 February 2001 File.pptx 20010201 File.pptx

    Meeting 04 Feb 2012.xlsMeeting 20120204.xls

    09 September 2011.txt 20110909.txt

    30 Jan 13.doc 20130130.doc

    There were some issues in my earlier attempt. Perhaps you were testing one of those.

    Edit: Noting that I'm not particularly proud of my brute force attempt. Certainly could be improved.

    Agreed. I just reran that code and it came up with the correct answer. We must have been working simultaneously... the last thing I did was refresh the window and quote your post. As you said, I must have copied the code from the earlier post.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it.

    Awesome, thanks for the in-depth assistance, works well. Though what will happen if there are other numeric values in the string such as the below?

    Sorry, I should have used more meaningful example test data.

    3.0 Meeting Minutes 16 January 2013.pdf

    0.0 ICT Meeting (12 Jan 2013).doc

    ICTAgenda 15January 2013 V1.pptx

    ICTAgenda 15 January 2013 V2.pptx

  • dc.pc100 (9/29/2013)


    The following will work but only with the understanding that no other numeric digits can exist in the string as the current test data would have it.

    Awesome, thanks for the in-depth assistance, works well. Though what will happen if there are other numeric values in the string such as the below?

    3.0 Meeting Minutes 16 January 2013.pdf

    0.0 ICT Meeting (12 Jan 2013).doc

    Mine can actually handle that nuance. The difficulty will arise if you include something else that has two consecutive digits followed by a non-digit. Here's the output from my run...

    SourceData NewValue

    --------------------------------------- --------------------------------

    12 Jan 2013 Test.docx 20130112 Test.docx

    01 February 2001 File.pptx 20010201 File.pptx

    Meeting 04 Feb 2012.xls Meeting 20120204.xls

    09 September 2011.txt 20110909.txt

    30 Jan 13.doc 20130130.doc

    3.0 Meeting Minutes 16 January 2013.pdf 3.0 Meeting Minutes 20130116.pdf

    0.0 ICT Meeting (12 Jan 2013).doc 0.0 ICT Meeting (20130112).doc

    Dwain's code will handle the extra data you added, as well.

    Before you can ask it, mine will also handle dates that are in the form of MM/DD/YYYY.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Awesome guys thank you so much. Beats manually going through 200,000 rows in multiple excel files to change the date 🙂

  • dc.pc100 (9/29/2013)


    Awesome guys thank you so much. Beats manually going through 200,000 rows in multiple excel files to change the date 🙂

    Like everrything else, "It Depends". If I was allowed to drink beer while working the Excel spreadsheet, I'd being inclined to prolong such agony. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I find it interesting because I'm into self-flagellation and because it seems that all of the date formats you've included in your filenames CAST cleanly to a DATE data type, which can then be formatted to be stuffed back into your filename.

    Here's an example using a pattern-based string splitter.

    WITH SampleData (sourcedata) AS

    (

    SELECT '12 Jan 2013 Test.docx'

    UNION ALL SELECT '01 February 2001 File.pptx'

    UNION ALL SELECT 'Meeting 04 Feb 2012.xls'

    UNION ALL SELECT '09 September 2011.txt'

    UNION ALL SELECT '30 Jan 13.doc'

    UNION ALL SELECT '3.0 Meeting Minutes 16 January 2013.pdf'

    UNION ALL SELECT '0.0 ICT Meeting (12 Jan 2013).doc'

    UNION ALL SELECT 'ICTAgenda 15January 2013 V1.pptx'

    UNION ALL SELECT 'ICTAgenda 15 January 2013 V2.pptx'

    ),

    SplitFNs AS

    (

    SELECT sourcedata, b.*

    ,rn=CASE WHEN LEFT(LTRIM(Item), 3) IN

    ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')

    THEN ItemNumber END

    FROM SampleData a

    CROSS APPLY dbo.PatternSplitCM(sourcedata, '[0-9]') b

    ),

    DateRecs AS

    (

    SELECT a.sourcedata, b.ItemNumber, b.Item

    FROM SplitFNs a

    JOIN SplitFNs b

    ON b.ItemNumber BETWEEN a.rn - 1 AND a.rn + 1 AND a.sourcedata = b.sourcedata

    )

    SELECT sourcedata

    ,sourcedata2=STUFF(sourcedata, CHARINDEX(FileDate, sourcedata), LEN(FileDate),CAST(FileDate AS DATE))

    FROM SampleData a

    CROSS APPLY

    (

    SELECT Item + ''

    FROM DateRecs b

    WHERE a.sourcedata = b.sourcedata

    ORDER BY ItemNumber

    FOR XML PATH('')

    ) b(FileDate);

    PatternSplitCM can be found in my 4th signature link.

    Note that I'm not necessarily recommending this approach. I just thought that odd ability to CAST to DATE might be easily exploitable, but alas I didn't find it so easy.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Delete

  • How or where would i add the below

    Where [SourceData] like '%[0-9]%' + 'Jan' + '[0-9]%'

    as lots of data gets messed up with dwains code, and want to limit it to date specific strings only

    e.g

    Destination File Name....sourcedata2

    20130616.xlsx..............0616.xlsx

    201307.xlsx.................07.xlsx

    HSE.201306.xlsx...........HSE.06.xlsx

    HSE.201307.xlsx...........HSE.07.xlsx

    2013 ICT Calendar.xls....ICT Calendar.xls

    Inv.10017355.txt..........Optus.Inv.7355.txt

  • dc.pc100 (9/30/2013)


    How or where would i add the below

    Where [SourceData] like '%[0-9]%' + 'Jan' + '[0-9]%'

    as lots of data gets messed up with dwains code

    e.g

    Destination File Name....sourcedata2

    20130616.xlsx..............0616.xlsx

    201307.xlsx.................07.xlsx

    HSE.201306.xlsx...........HSE.06.xlsx

    HSE.201307.xlsx...........HSE.07.xlsx

    2013 ICT Calendar.xls....ICT Calendar.xls

    Inv.10017355.txt..........Optus.Inv.7355.txt

    Yup mine won't handle those.

    Have you tried Jeff's code with this case?

    I think using LIKE like you are trying to do is going to be a pain because you'd need 12 of them OR'd together.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • I havent had much experience with Tally tables so do no know how to utilize them.

    The "OR" x12 in the where clause it fine for now. Ill probs just create another table and query the words(month names) in there

    Not best practice but yea

    Just having troubles inserting the where clause, as it doesn't like it anywhere near there Cross Apply

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply