splitting 1 field into multiple fields

  • I'm quite new to SQL and have found myself into a bit of a conundrum..

    I have a table where the names look like this:

     

    Mr E G Ford

    Mr Trevor Peacock

    Mr Jones

    Mr B Racoon.

     

    I'm trying to get them into title,firstname and surname fields.  Best thing I could come up with is using parsename and replacing the " " with a . Of course that would cause havoc on the firstname/initials.

    Could anyone help?

     

  • Try something with this: http://www.sqlservercentral.com/columnists/sjones/tamestrings2.asp

     

  • Won't this still give me the same problems in the initials/forename field.  So I end up with something like:

     

    MR

    E

    G

    Ford

    Mr

    Trevor

    Peacock.

     

     

  • You need to decide what your business rules dictate should happen. For instance, with Mr E G Ford, what do you want to end up with in firstname? 'E', and throw the 'G' away? 'E G' ? What about Mr Jones, what does he get in firstname, ''? NULL?

    Either alone or in consultation with whoever will be consuming the data, put together a definitive set of rules about what happens in every possible input case. Then you can code.

    The tools you will be using for the coding are CHARINDEX (to find one string within another), SUBSTRING (to get part of a string), LEFT and RIGHT (which aren't necessary since SUBSTRING can do what they do, but they make things clearer), LEN (to get the length of a string), REVERSE (to reverse a string).

    For example, to get the first 'word' of your input string (for your title field, say), you could use the expression

    LEFT(inputname, CHARINDEX(' ', inputname)-1)

    To extract the second 'word' of the input string, you could say

    SUBSTRING(inputname, 
    CHARINDEX(' ', inputname)+1, 
    CHARINDEX(' ', inputname, CHARINDEX(' ', inputname)+1) - CHARINDEX(' ', indexname) - 1)

    Here we want the substring that starts at (location of first space, plus one) and has length (location of second space, less location of first space, less one). The location of the second space is the location of the first space AFTER the first space (hope that makes sense). For example, for

    Mr E G Jones
    123456789012
    we have
    CHARINDEX(' ', inputname) = 3
    CHARINDEX(' ', inputname, 3+1) = 5

    so we would extract

    SUBSTRING(inputname, 3+1, 5-3-1) = 
    SUBSTRING(inputname, 4, 1) = 
    'E'

    To extract the *last* 'word' of inputname (for the surname) we have to be a little sneaky since T-SQL lacks a 'reverse search' function (like VB's InStrRev, say). We use the fact that the last space in inputname is the first space in REVERSE(inputname), so we end up with

    RIGHT(inputname, CHARINDEX(' ', REVERSE(inputname))-1)

    Notice how using RIGHT rather then SUBSTRING saves us having to work out the length of inputname.

    We could use the same technique if we decided that 'firstname' was to be filled with everything between the first and last spaces of inputname:

    SUBSTRING(
    inputname, 
    CHARINDEX(' ', inputname), 
    LEN(inputname)-CHARINDEX(' ', REVERSE(inputname)+1-CHARINDEX(' ', inputname))

     

    This has been just a taster of the wonderful world of string manipulation; I have neglected to include any kind of error checking, which you would have to do with real world data, probably in a preprocessing query. But hopefully you will be able to pick up some ideas.

     

  • That works a treat.  I've put a bit of error checking around the fullname and then what comes out the other end and it's all hunky dorey.

     

    Thanks a lot.

     

     

  • Just a comment on the subject of name parsing.  The last word in the string is not necessarily the surname.  For example, the surname "Saint Raymond" is 2 words.  Also, the first word is not necessarily the entire first name.  My friends "first name" is "Mary Dee".  As someone who constantly gets sorted into the wrong column based on my 2 word last name, I am sensitive to this problem.  This is an issue of data collection and business rules, not a programming issue, but I wanted you to be aware of it.



    Terri

    To speak algebraically, Mr. M. is execrable, but Mr. C. is
    (x+1)-ecrable.
    Edgar Allan Poe
    [Discussing fellow writers Cornelius Mathews and William Ellery Channing.]

  • I completley agree with you, it is a data collection issue.  I've always prefered to keep my names into separate fields for this very reason.  Hence the fact I was doing the above. 

     

     

  • try this...

    CREATE procedure dbo.xxx

    @vchName varchar (50),

    @vchPrefix varchar(50) output,

    @vchFirstName  varchar(50) output,

    @vchMiddleName  varchar(50) output,

    @vchLastName  varchar(50) output,

    @vchSuffix  varchar(50) output

    as

    set nocount on

    declare @tiStringLength  tinyint,

     @tiSpaceLocator  tinyint,

     @tiPointer  tinyint,

     @tiCount  tinyint,

     @tiRecCount  tinyint,

     @vchStrHold  varchar(50),

     @vchOne  varchar(50),

     @vchTwo   varchar(50),

     @vchThree   varchar(50),

     @vchFour   varchar(50),

     @vchFive   varchar(50)

    select @tiRecCount = 0

    select @vchName = @vchName + ' '

    select @vchName = ltrim(@vchName)

    select @tiCount = 1

    WHILE @tiCount < 6

    BEGIN

     select @tiStringLength = datalength(@vchName)

     select @tiSpaceLocator = patindex("% %",@vchName)

     IF @tiSpaceLocator > 0

     BEGIN

      select @tiRecCount = @tiRecCount + 1

      select @vchStrHold = LEFT(@vchName, @tiSpaceLocator)

      IF @tiCount = 1 

       select @vchOne = @vchStrHold

      IF @tiCount = 2

       select @vchTwo = @vchStrHold

      IF @tiCount = 3 

       select @vchThree = @vchStrHold

      IF @tiCount = 4 

       select @vchFour = @vchStrHold

      IF @tiCount = 5 

       select @vchFive = @vchStrHold

     END

     ELSE

     BEGIN

      select @vchStrHold = null

      IF @tiCount = 1 

       select @vchOne = @vchStrHold

      IF @tiCount = 2

       select @vchTwo = @vchStrHold

      IF @tiCount = 3 

       select @vchThree = @vchStrHold

      IF @tiCount = 4 

       select @vchFour = @vchStrHold

      IF @tiCount = 5 

       select @vchFive = @vchStrHold

     END 

     select @tiPointer = @tiStringLength - @tiSpaceLocator 

     select @vchName = right(@vchName,@tiPointer)

     select @vchName = ltrim(@vchName)

     if @vchStrHold = ''

      break

     select @tiCount = @tiCount + 1

    END

    IF @tiRecCount = 1 --put value in last name

     select @vchLastName = @vchOne

    IF @tiRecCount = 2

     IF @vchOne like ('Mr%') or @vchOne like ('Dr%') or @vchOne like ('Ms%')

     BEGIN

      select @vchPrefix  = @vchOne

      select @vchLastName  = @vchTwo

     END

     ELSE

     BEGIN

      select @vchFirstName  = @vchOne

      select @vchLastName   = @vchTwo

     END

    IF @tiRecCount = 3

    BEGIN

     IF @vchOne like ('Mr%') or @vchOne like ('Dr%') or @vchOne like ('Ms%')

      BEGIN

      select @vchPrefix  = @vchOne

      select @vchFirstName  = @vchTwo

      select @vchLastName  = @vchThree

     END

     ELSE

      IF @vchThree not like ('Jr%') and @vchThree not like ('Sr%')

           and @vchThree not like ('III') and @vchThree not like ('3')

          and @vchThree not like ('IV') and @vchThree not like ('4')

          and @vchThree not like ('V') and @vchThree not like ('5')

      BEGIN

       select @vchFirstName  = @vchOne

       select @vchMiddleName  = @vchTwo

       select @vchLastName  = @vchThree

      END

     IF @vchThree like ('Jr%') or @vchThree like ('Sr%')

           or @vchThree like ('III') or @vchThree like ('3')

          or @vchThree like ('IV') or @vchThree like ('4')

          or @vchThree like ('V') or @vchThree like ('5')

      BEGIN

       select @vchFirstName  = @vchOne

       select @vchLastName   = @vchTwo

       select @vchSuffix  = @vchThree

      END

    END

    IF @tiRecCount = 4

    BEGIN

     IF @vchOne like ('Mr%') or @vchOne like ('Dr%') or @vchOne like ('Ms%')

      IF @vchFour not like ('Jr%') and @vchFour not like ('Sr%')

           and @vchFour not like ('III') and @vchFour not like ('3')

          and @vchFour not like ('IV') and @vchFour not like ('4')

          and @vchFour not like ('V') and @vchFour not like ('5') 

      BEGIN

       select @vchPrefix = @vchOne

       select @vchFirstName  = @vchTwo

       select @vchMiddleName = @vchThree

       select @vchLastName  = @vchFour

      END

      ELSE

      BEGIN

       select @vchPrefix  = @vchOne

       select @vchFirstName  = @vchTwo

       select @vchLastName   = @vchThree

       select @vchSuffix  = @vchFour

      END

     ELSE

     BEGIN

      select @vchFirstName  = @vchOne

      select @vchMiddleName  = @vchTwo

      select @vchLastName  = @vchThree

      select @vchSuffix  = @vchFour

     END

    END

    IF @tiRecCount = 5

    BEGIN

     select @vchPrefix  = @vchOne

     select @vchFirstName  = @vchTwo

     select @vchMiddleName  = @vchThree

     select @vchLastName  = @vchFour

     select @vchSuffix  = @vchFive

    END

    RETURN

    GO

     

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply