Substring and Charindex

  • I am using the following case statement to extract information from a long string with the '|' symbol listed multiple times in the string.

    Case

    when Len(F.Comments) is null

    then Null

    when Left(F.Comments,Charindex('|', F.Comments)-1) like 'File Name:%'

    then Substring(F.Comments,11,CHARINDEX( '|', F.Comments)-1)

    when Left(F.Comments,Charindex('|', F.Comments)-1) like 'File:%'

    then Substring(F.Comments,6,CHARINDEX( '|', F.Comments)-1)

    when (F.Comments is not Null or Left(F.Comments,Charindex('|', F.Comments,1)-1) not like 'File Name:%'

    or Left(F.Comments,Charindex('|', F.Comments)-1) not like 'File:')

    then Substring(F.Comments,1,CHARINDEX( '|', F.Comments)-1)

    End as Letter_name,

    An example of the results are listed below:

    ACCT#Letter_name

    1BACGENTPN_CF_10202011

    2BTPN_09232010 | Vendor:

    3BTPN_09232010 | Vendor:

    4 BACGENTPN_07152010 | Vendor:

    5 BACGENTPN_07152010 | Vendor:

    6BACLITTPN_11262010,LMO 0416 Cover Letter|Vendor: S

    7BACGENTPN_08312010 | Ven

    8BACGENTPN_08312010 | Ven

    9BACGENTPN_08312010 | Ven

    10BACGENTPN_08312010 | Ven

    11BACGENTPN_08312010 | Ven

    12BACGENTPN_08312010 | Ven

    13BACGENTPN_08312010 | Ven

    14 BACBKPTPN_08242010 | Vendor:

    15 BACBKPTPN_08242010 | Vendor:

    16 BACBKPTPN_09092010 | Vendor:

    17 BACBKPTPN_09092010 | Vendor:

    18BACGENTPN_09212010 | Vendor:

    19BACGENTPN_09212010 | Vendor:

    20BACGENTPN_08312010 | Ven

    21BACGENTPN_08312010 | Ven

    22BTPN_09232010 | Vendor:

    23BACBKPTPN_01272011_V2| Vendor:S

    24BACBKPTPN_01272011_V2| Vendor:S

    25BTPN_09232010 | Vendor:

    Row 1 is the ideal Letter_name I want returned. I only want to return the characters before the 1st '|'. I used substring because the words File: and File Name: were at the beginning of some of the strings. Any suggestions?

  • Try something like this - I wasn't able to test without setting up data, but this should be close:

    SELECT

    Field=F.Comments

    --find starting location: first colon + 1

    ,StartAt=CHARINDEX( ':',F.Comments) + 1

    --find ending location: first pipe - 1

    ,EndAt=CHARINDEX( '|',F.Comments) - 1

    --find number of chars to return (end location - start location)

    ,NumberOfChars=(CHARINDEX( '|',F.Comments) - 1) - (CHARINDEX( ':',F.Comments) + 1)

    --get substring: substring(field, StartAt, NumberOfChars)

    ,GetMidString=SUBSTRING(F.Comments,CHARINDEX( ':',F.Comments) + 1, (CHARINDEX( '|',F.Comments) - 1) - (CHARINDEX( ':',F.Comments)))

    ,GetString=SUBSTRING(F.Comments, 0, CHARINDEX( '|',F.Comments) - 1

    --put them together

    ,CASE

    WHEN F.Comments IS NULL THEN NULL

    WHEN F.Comments LIKE 'File%' THEN SUBSTRING(F.Comments,CHARINDEX( ':',F.Comments) + 1, (CHARINDEX( '|',F.Comments) - 1) - (CHARINDEX( ':',F.Comments)))

    ELSE SUBSTRING(F.Comments, 0, CHARINDEX( '|',F.Comments) - 1

    END

    [font="Courier New"]Looking for a Deadlock Victim Support Group..[/font]
  • The_V (10/24/2011)


    I am using the following case statement to extract information from a long string with the '|' symbol listed multiple times in the string.

    Case

    when Len(F.Comments) is null

    then Null

    when Left(F.Comments,Charindex('|', F.Comments)-1) like 'File Name:%'

    then Substring(F.Comments,11,CHARINDEX( '|', F.Comments)-1)

    when Left(F.Comments,Charindex('|', F.Comments)-1) like 'File:%'

    then Substring(F.Comments,6,CHARINDEX( '|', F.Comments)-1)

    when (F.Comments is not Null or Left(F.Comments,Charindex('|', F.Comments,1)-1) not like 'File Name:%'

    or Left(F.Comments,Charindex('|', F.Comments)-1) not like 'File:')

    then Substring(F.Comments,1,CHARINDEX( '|', F.Comments)-1)

    End as Letter_name,

    An example of the results are listed below:

    ACCT#Letter_name

    1BACGENTPN_CF_10202011

    2BTPN_09232010 | Vendor:

    3BTPN_09232010 | Vendor:

    4 BACGENTPN_07152010 | Vendor:

    5 BACGENTPN_07152010 | Vendor:

    6BACLITTPN_11262010,LMO 0416 Cover Letter|Vendor: S

    7BACGENTPN_08312010 | Ven

    8BACGENTPN_08312010 | Ven

    9BACGENTPN_08312010 | Ven

    10BACGENTPN_08312010 | Ven

    11BACGENTPN_08312010 | Ven

    12BACGENTPN_08312010 | Ven

    13BACGENTPN_08312010 | Ven

    14 BACBKPTPN_08242010 | Vendor:

    15 BACBKPTPN_08242010 | Vendor:

    16 BACBKPTPN_09092010 | Vendor:

    17 BACBKPTPN_09092010 | Vendor:

    18BACGENTPN_09212010 | Vendor:

    19BACGENTPN_09212010 | Vendor:

    20BACGENTPN_08312010 | Ven

    21BACGENTPN_08312010 | Ven

    22BTPN_09232010 | Vendor:

    23BACBKPTPN_01272011_V2| Vendor:S

    24BACBKPTPN_01272011_V2| Vendor:S

    25BTPN_09232010 | Vendor:

    Row 1 is the ideal Letter_name I want returned. I only want to return the characters before the 1st '|'. I used substring because the words File: and File Name: were at the beginning of some of the strings. Any suggestions?

    Could you post the unmodified data for the first 6 lines of results you have above? Please... if there's any private information in the data, just say "No. There's private information in the data" or if you can obfuscate the private info so that you could post the data, that would be handy. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • No, I cannot post the unmodified data, it is private. The code and illustration details what is returned. I changed the account numbers which is not significant.

  • The_V (10/24/2011)


    I am using the following case statement to extract information from a long string with the '|' symbol listed multiple times in the string.

    Case

    when Len(F.Comments) is null

    then Null

    when Left(F.Comments,Charindex('|', F.Comments)-1) like 'File Name:%'

    then Substring(F.Comments,11,CHARINDEX( '|', F.Comments)-1)

    when Left(F.Comments,Charindex('|', F.Comments)-1) like 'File:%'

    then Substring(F.Comments,6,CHARINDEX( '|', F.Comments)-1)

    when (F.Comments is not Null or Left(F.Comments,Charindex('|', F.Comments,1)-1) not like 'File Name:%'

    or Left(F.Comments,Charindex('|', F.Comments)-1) not like 'File:')

    then Substring(F.Comments,1,CHARINDEX( '|', F.Comments)-1)

    End as Letter_name,

    An example of the results are listed below:

    ACCT#Letter_name

    1BACGENTPN_CF_10202011

    2BTPN_09232010 | Vendor:

    3BTPN_09232010 | Vendor:

    4 BACGENTPN_07152010 | Vendor:

    5 BACGENTPN_07152010 | Vendor:

    6BACLITTPN_11262010,LMO 0416 Cover Letter|Vendor: S

    7BACGENTPN_08312010 | Ven

    8BACGENTPN_08312010 | Ven

    9BACGENTPN_08312010 | Ven

    10BACGENTPN_08312010 | Ven

    11BACGENTPN_08312010 | Ven

    12BACGENTPN_08312010 | Ven

    13BACGENTPN_08312010 | Ven

    14 BACBKPTPN_08242010 | Vendor:

    15 BACBKPTPN_08242010 | Vendor:

    16 BACBKPTPN_09092010 | Vendor:

    17 BACBKPTPN_09092010 | Vendor:

    18BACGENTPN_09212010 | Vendor:

    19BACGENTPN_09212010 | Vendor:

    20BACGENTPN_08312010 | Ven

    21BACGENTPN_08312010 | Ven

    22BTPN_09232010 | Vendor:

    23BACBKPTPN_01272011_V2| Vendor:S

    24BACBKPTPN_01272011_V2| Vendor:S

    25BTPN_09232010 | Vendor:

    Row 1 is the ideal Letter_name I want returned. I only want to return the characters before the 1st '|'. I used substring because the words File: and File Name: were at the beginning of some of the strings. Any suggestions?

    Try changing

    Substring(F.Comments,11,CHARINDEX( '|', F.Comments)-1)

    to

    Substring(F.Comments,11,CHARINDEX( '|', F.Comments)-11) -- or maybe -12?

    and also change

    Substring(F.Comments,6,CHARINDEX( '|', F.Comments)-1)

    to

    Substring(F.Comments,6,CHARINDEX( '|', F.Comments)-7) -- or maybe -6?

    The reason is that the third parameter of substring is the number of characters included, not the last wanted position in the original string.

    Tom

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply