Parse strings to get select output

  • Hi i have a field with following value :

    USE TESTTABLE

    GO

    CREATE TABLE Example (

    Description varchar(500))

    GO

    INSERT INTO [TESTTABLE].[dbo].[Example] ([Description])

    VALUES ('hello This is a test query. Thank you.

    Please be advised of the following :

    Name: Doe,John

    ID: 123456

    Preferred Name: None

    Type: My type

    Class: My Class

    Last Day of Year: 31-Dec-2009

    Domain: MYDomain

    Username: My Username

    Email: my@email.com

    Contact: Jane,Mary')

    GO

    INSERT INTO [TESTTABLE].[dbo].[Example] ([Description])

    VALUES ('hello This is a test query. Thank you.

    Please be advised of the following :

    Name:Doe, John

    ID: 123456

    Preferred Name: None

    Type: My type

    Class: My Class

    Last Day of Year: 31-Dec-2009

    Domain: MYDomain

    Username: My Username

    Email: my@email.com

    Contact: Jane,Mary')

    GO

    INSERT INTO [TESTTABLE].[dbo].[Example] ([Description])

    VALUES ('hello This is a test query. Thank you.

    Please be advised of the following :

    Name: Doe,John

    ID: 123456

    Preferred Name: None

    Type: My type

    Class: My Class

    Last Day of Year: 31-Dec-2009

    Domain: MYDomain

    Username: My Username

    Email: my@email.com

    Contact: Jane,Mary')

    GO

    I want to extract three things from the above data stored ina single field

    1. LAST NAME (All characters before COMMA till a special character is met)

    2. FIRST NAME (All characters AFTER COMMA till a special character is met)

    2. ID (All numbers after the word ID. Ignore all special characters before and after the numbers)

    The Output will be :

    FIRST NAME LAST NAME ID

    John Doe 123456

    John Doe 123456

    John Doe 123456

    Please help how can I use substrings to extract this data in SQL 2005

  • This should do it... the double-split on name required an extra step. Doing it in steps like this is actually a bit faster than trying to do it all in one big formula.

    WITH

    cteSplit1 AS

    (--==== Split out the begining of the two basic parts

    SELECT LTRIM(SUBSTRING(Description,CHARINDEX('Name:',Description)+5,8000)) AS NamePart,

    LTRIM(SUBSTRING(Description,CHARINDEX('ID:' ,Description)+3,8000)) AS IDPart

    FROM dbo.Example

    )

    ,

    cteSplit2 AS

    (--==== Continue splitting to finer resolution

    SELECT LTRIM(SUBSTRING(NamePart,CHARINDEX(',',NamePart)+1,8000)) AS FirstPart,

    SUBSTRING(NamePart,1,CHARINDEX(',',NamePart)-1) AS LastName,

    SUBSTRING(IDPart,1,PATINDEX('%[^0-9]%',IDPart)-1) AS ID

    FROM cteSplit1

    )

    --===== Final split on name to get the first name and show all the rest

    SELECT SUBSTRING(FirstPart,1,CHARINDEX(CHAR(13),FirstPart)-1) AS FirstName,

    LastName,

    ID

    FROM cteSplit2

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This can also be done in .NET using regular expressions...

  • Thanks for your help guys.

    It worked great. I'm so happy :):-D

  • In fact there is a small problem.I get the error message

    Invalid length parameter passed to the substring function.

    I have an idea that it couldbe happenig because of null/ negative values but I'm not able to incorporate nullifor such statement into it.

    If I can get some help,tat wud be great.

  • It would be a help if you could post the data that caused the problem. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This is the exact data with minor modifications (The entire text goes into a single field and i need to extract first name, last name, middle name if present, and ID):

    HelloWorld - My Email

    This message may contain old / new information. If you are not the addressee or authorized to receive this for the addressee, you must not use, copy, disclose, or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by reply e-mail and delete this message. Thank you for your cooperation.

    Please be advised of the following information:

    Name: Doe,John

    ID: 123456

    Preferred Name: Doe,John

    Type: Mytype

    Mytype / yourtype: MyType

    Class: Myclass

    First Day of Month: 01-Dec-2009

    Last Day : 31-Dec-2009

    Address: 355 - Max street

    Post Bag: 2053 - Mypostbag

    Office: 2010 - Office - Myoffice

    Location: My Location new

    Friend's Name: Jane,Mary

    Friend's Location: friendloc

    Region: XY

    alaisname: jonny

    Email: john@doe.com

    Emergency Contact: Hunt,Ethan

  • I can't get it to fail with any of the data you've provided. Then only thing that I can think of is...

    1) Maybe you're providing the wrong line of data as a failure. Do all of the lines have both a NAME and ID "field"? If they don't, that will cause a failure that we'll need to fix with ISNULL.

    2) Maybe there's a different special character at the end of each "field". My code is using CHAR(13) (ie:Carriage Return). Perhaps the actual data is using CHAR(10) (ie:New Line or Line Feed) or maybe it's using some combination of CHAR(13) and CHAR(10). For that matter, maybe it's using a "Vertical Tab" (CHAR(11) if memory serves correctly). You need to find out what the special character(s) is(are) between the fields for this to work correctly.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply