best way to extract name and number

  • I have data in a field called user_name that consistently follows this format:

    (t) Joseph Kelly -14784

    (tb) Shenna Brown -15488

    (tb) Paul Ledford -15358

    (b) Quadra Tomson -15316

    (t) Tracy Edwards -15218

    I can get the number out easily with "SUBSTRING([user_name], CHARINDEX('-',[user_name])+1, 25) as EmployeeID", e.g.: 14784, but I'm having trouble using the same or similar logic to pull out just the first name, e.g.: Joseph and the last name e.g.: Kelly, any ideas? Thanks!

  • I can get the full name with this:

    ,LTRIM(RTRIM(SUBSTRING([user_name], CHARINDEX(')', [user_name]) + 1, 
    PATINDEX('%[+-][0-9]%', [user_name]) - (CHARINDEX(')', [user_name]) + 1))))
  • Here's a way to grab first name:

    DROP table IF EXISTS #users;
    create table #users
    (
    user_name VARCHAR(255) NOT NULL PRIMARY KEY
    );

    INSERT INTO #users (user_name)
    VALUES ('(t) Joseph Kelly -14784'),
    ('(tb) Shenna Brown -15488'),
    ('(tb) Paul Ledford -15358'),
    ('(b) Quadra Tomson -15316'),
    ('(t) Tracy Edwards -15218');

    SELECT CHARINDEX(')',user_name) + 2 AS FirstNamePos,
    CHARINDEX(' ',user_name, CHARINDEX(')',user_name) + 2) AS LastNamePos,
    SUBSTRING(user_name,CHARINDEX(')',user_name) + 2,CHARINDEX(' ',user_name, CHARINDEX(')',user_name) + 2) - CHARINDEX(')',user_name)-2)
    FROM #users

  • Since you are able to get the full name which is half of your battle.  I see multiple approaches to this, none of which are "pretty".

    The most supportable approach I can think of is to store the results you have (employeeID and full name) in a temporary table and then process the first name and last name from that temporary table which should be pretty easy to do.

    Plus side to this approach is that it is easy to test and verify that your full name script is working as expected and in the future if it does break, you have a quick and easy way to test things.  Downside to the approach is that it uses tempdb which means you are writing to disk and it may (likely will) be a slower approach than doing it all in one query.

    Now, to do it in a single query, this will get you first name, last name, and employee ID:

    SELECT
    SUBSTRING( [user_name]
    , CHARINDEX('-'
    , [user_name]
    ) + 1
    , 25
    ) AS [EmployeeID]
    , SUBSTRING( SUBSTRING([user_name]
    , CHARINDEX( ' '
    , [user_name]
    ) + 1
    , 255
    )
    , 1
    , CHARINDEX( ' '
    , SUBSTRING( [user_name]
    , CHARINDEX(' '
    , [user_name]
    ) + 1
    , 255
    )
    )
    ) AS [FirstName]
    , SUBSTRING( SUBSTRING([user_name]
    , CHARINDEX( ' '
    , [user_name]
    ) + 1
    , 255
    )-- trims off the first 3 characters
    , CHARINDEX( ' '
    , SUBSTRING( [user_name]
    , CHARINDEX(' '
    , [user_name]
    ) + 1
    , 255
    )-- starts at the point where last name starts
    )
    , CHARINDEX( '-'
    , [user_name]
    ) - CHARINDEX( ' '
    , [user_name]
    , CHARINDEX( ' '
    , [user_name]
    ) + 1
    )-- need the proper length to start at
    ) AS [LastName];

    A bit messy, but should be pretty easy to follow (I hope).

    First name is just a matter of finding the first space character as the location we want to start at and keep going until the second space character.  Last name is a little harder, but similar logic, we just want to start at the second space character and go until the - character.  The length for last name is a bit more complicated to calculate, but it is the charindex  of the - in the main string minus the charindex of the second space and you are good to go.  I left some comments in the last name one so you could see my logic.

    Does the above make sense?

     

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • So... wha'cha wanna do with a name like "Jill St. John" or "James van Allen" or a name with a hyphen in it???

    Seems like there's a pretty good chance that this information is coming from an HR database somewhere and there's a really good chance your file or table was made from that.  That also means there's a really good chance they could delimit the parts of the name with something other than a hyphen or a space (a TAB character would be the best) and have ALL the columns come across with some of them being empty strings, of course.

    Better yet, if the HR database is on an SQL Server, have them send it to you as a native format BCP file and you can just import it directly instead of farting around with "Comedy Separated Values). 😀

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I wouldn't go thru all those gyrations to pull out the values.  Imagine the hassle to just add one more column to the output.

    I'd do something similar to this instead.

    SELECT ca1.*
    FROM #users
    CROSS APPLY (
    SELECT
    MAX(CASE WHEN row_num = 1 THEN Value END) AS First_Name,
    MAX(CASE WHEN row_num = 2 THEN Value END) AS Last_Name,
    MAX(CASE WHEN Value NOT LIKE '%[^0-9]%' THEN Value END) AS EmployeeID
    FROM (
    SELECT CASE WHEN Item LIKE '-%' THEN SUBSTRING(Item, 2, 100) ELSE Item END AS Value, ROW_NUMBER() OVER(ORDER BY ItemNumber) AS row_num
    FROM dbo.DelimitedSplit8K(user_name, ' ')
    WHERE Item NOT LIKE '(%'
    ) AS derived
    ) AS ca1

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • This was removed by the editor as SPAM

  • Jeff Moden wrote:

    So... wha'cha wanna do with a name like "Jill St. John" or "James van Allen" or a name with a hyphen in it???

    Seems like there's a pretty good chance that this information is coming from an HR database somewhere and there's a really good chance your file or table was made from that.  That also means there's a really good chance they could delimit the parts of the name with something other than a hyphen or a space (a TAB character would be the best) and have ALL the columns come across with some of them being empty strings, of course.

    Better yet, if the HR database is on an SQL Server, have them send it to you as a native format BCP file and you can just import it directly instead of farting around with "Comedy Separated Values). 😀

    I'd still like to know what's going to happen for the other scenarios I mention above.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This was removed by the editor as SPAM

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply