How to extract the paricular values from a string

  • Hi,

    We have the below file names

    File Name1 = 'XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.TXT'

    File Name2 = 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT

    Query:

    Need to extract the file names as below

    File Name1 = XX_YYYYY_AA_BBB_SampleTransaction

    File Name2 = XX_YYYYYY_DDD_MasterTransaction

    The values after the first underscore from right(_55322) will be increasing from file to file, so we cant use the below query

    SELECT SUBSTRING(@FILENAME,1,(LEN(@FILENAME)-16))

    Could you please let me know the query to get the correct result?

    Regards

    SqlStud

  • You can reverse the string, look for the position of the second underscore and take the substring starting from that position till the end. Reverse the substring back and you have your result.

    edit: or don't reverse the string and look for the fourth underscore, if the first part keeps the same structure.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Here are two ways:

    WITH Filenames (fn) AS (

    SELECT 'XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.TXT'

    UNION ALL SELECT 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT')

    SELECT fn=SUBSTRING(fn, 1, CHARINDEX('.', fn))

    FROM Filenames;

    WITH Filenames (fn) AS (

    SELECT 'XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.TXT'

    UNION ALL SELECT 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT')

    SELECT fn=Item

    FROM Filenames

    CROSS APPLY PatternSplitCM(fn, '[0-9a-zA-Z_]')

    WHERE [Matched] = 1 AND ItemNumber = 1;

    The second uses PatternSplitCM, which is a general tool that can be used for this purpose, and is described in the 4th article in my signature (splitting strings on patterns). That article also shows how to split out other components of a file name.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Koen Verbeeck (1/8/2013)


    You can reverse the string, look for the position of the second underscore and take the substring starting from that position till the end. Reverse the substring back and you have your result.

    edit: or don't reverse the string and look for the fourth underscore, if the first part keeps the same structure.

    Thanks SScrazy.

    Can you please send me the query?

    Regards

    SqlStud

  • dwain.c (1/8/2013)


    Here are two ways:

    WITH Filenames (fn) AS (

    SELECT 'XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.TXT'

    UNION ALL SELECT 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT')

    SELECT fn=SUBSTRING(fn, 1, CHARINDEX('.', fn))

    FROM Filenames;

    WITH Filenames (fn) AS (

    SELECT 'XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.TXT'

    UNION ALL SELECT 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT')

    SELECT fn=Item

    FROM Filenames

    CROSS APPLY PatternSplitCM(fn, '[0-9a-zA-Z_]')

    WHERE [Matched] = 1 AND ItemNumber = 1;

    The second uses PatternSplitCM, which is a general tool that can be used for this purpose, and is described in the 4th article in my signature (splitting strings on patterns). That article also shows how to split out other components of a file name.

    Thanks dwain.c

    using the first approach, the result is like

    XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.

    XX_YYYYYY_DDD_MasterTransaction_120807_00005.

    But i need the below output

    XX_YYYYY_AA_BBB_SampleTransaction

    XX_YYYYYY_DDD_MasterTransaction

    Please suggest on this..

    Regards

    SqlStud

  • sqlstud (1/8/2013)


    dwain.c (1/8/2013)


    Here are two ways:

    WITH Filenames (fn) AS (

    SELECT 'XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.TXT'

    UNION ALL SELECT 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT')

    SELECT fn=SUBSTRING(fn, 1, CHARINDEX('.', fn))

    FROM Filenames;

    WITH Filenames (fn) AS (

    SELECT 'XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.TXT'

    UNION ALL SELECT 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT')

    SELECT fn=Item

    FROM Filenames

    CROSS APPLY PatternSplitCM(fn, '[0-9a-zA-Z_]')

    WHERE [Matched] = 1 AND ItemNumber = 1;

    The second uses PatternSplitCM, which is a general tool that can be used for this purpose, and is described in the 4th article in my signature (splitting strings on patterns). That article also shows how to split out other components of a file name.

    Thanks dwain.c

    using the first approach, the result is like

    XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.

    XX_YYYYYY_DDD_MasterTransaction_120807_00005.

    But i need the below output

    XX_YYYYY_AA_BBB_SampleTransaction

    XX_YYYYYY_DDD_MasterTransaction

    Please suggest on this..

    Regards

    SqlStud

    Sorry about that! Misread the requirement. Try this:

    WITH Filenames (fn) AS (

    SELECT 'XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.TXT'

    UNION ALL SELECT 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT')

    SELECT fn=SUBSTRING(fn, 1, PATINDEX('%[0-9]%', fn)-2)

    FROM Filenames;

    The PatternSplitCM will work too (not as is) but it would need to be changed into something quite a bit more complicated.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Or, even simpler:

    WITH Filenames (fn) AS (

    SELECT 'XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.TXT'

    UNION ALL SELECT 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT')

    SELECT fn=LEFT(fn, PATINDEX('%[0-9]%', fn)-2)

    FROM Filenames;


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (1/8/2013)


    Or, even simpler:

    WITH Filenames (fn) AS (

    SELECT 'XX_YYYYY_AA_BBB_SampleTransaction_120807_55322.TXT'

    UNION ALL SELECT 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT')

    SELECT fn=LEFT(fn, PATINDEX('%[0-9]%', fn)-2)

    FROM Filenames;

    Thanks a lot Dwain.

    Its working...

    declare @a varchar(100) = 'XX_YYYYYY_DDD_MasterTransaction_120807_00005.TXT'

    select LEFT(@A, patindex('%_[0-9]%', @a)-1)

    Regards

    SqlStud

  • Koen Verbeeck (1/8/2013)


    You can reverse the string, look for the position of the second underscore and take the substring starting from that position till the end. Reverse the substring back and you have your result.

    edit: or don't reverse the string and look for the fourth underscore, if the first part keeps the same structure.

    Thanks Koen

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply