SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Split one Name Column into Two Colmns using SSIS


Split one Name Column into Two Colmns using SSIS

Author
Message
brian.brubaker
brian.brubaker
Grasshopper
Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)

Group: General Forum Members
Points: 21 Visits: 29
I am working with SQL Server 2012 and Visual Studio 2010 with Excel 2007. I am building a package that will pull information from a .CSV file and import it into a table in a DB that we are creating for an HRIS project.

The .CSV file has the following fields starting in B9 and running through column J (row length will vary)

| PersNo | IDNo | Name | PArea | OrgUnit | OrgName | UserID | EntryDate |

The issue I have is that I need to take the Name column and split it into a FirstName and LastName columns. I have scoured the internet and found a number of forums that detail methods that use Derived Columns, Conditional Split, Script Component and MultiCast. None of them were helpful or fit my needs.

The issue is complicated by the name arrangement. I have three different name types that can be displayed.

John Smith
John A Smith
John Smith III

The last two they will need to be split differently.

If there is a middle initial they want it split as such.

| FirstName | LastName |
------------------------
| John | Smith A |

i.e. Smith and the Middle Initial will be in the LastName column

If they have a suffix then they will need to be split as such.

| FirstName | LastName |
-------------------------
| John | Smith III |

i.e. Smith and the suffix will be in the LastName column

I need to know if there is a way that I can split this out in one package. Eventually this will be automated to run daily so I will also have to utilize a lookup I assume to filter out any duplicates and enter only new data.

I appreciate anything information that anyone can provide.
tomgough
tomgough
SSC Journeyman
SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)

Group: General Forum Members
Points: 88 Visits: 125
Could you also get a middle initial and a suffix?
Or more than one middle initial?

Sorry to complicate matters, but this could impact how to go about solving it
brian.brubaker
brian.brubaker
Grasshopper
Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)

Group: General Forum Members
Points: 21 Visits: 29
Yes, there can ba a situation where there is a middle initial and a suffix.
However, there will not be an issue with 2 middle initials. The system we use only allows one character for middle initial.
tomgough
tomgough
SSC Journeyman
SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)SSC Journeyman (88 reputation)

Group: General Forum Members
Points: 88 Visits: 125
OK, then I think my approach would be to split up the string into its component parts. If you get 2 pieces you must have just first name, surname. If you get 4, it will be 1st name, initial, surname, suffix.

If you get 3 and the second is a single character, I think it is safe to assume that is a middle initial. If not, that it is a surname so the last element is a suffix. When you know what each component is, reconstitute it in the format you require in the end
brian.brubaker
brian.brubaker
Grasshopper
Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)

Group: General Forum Members
Points: 21 Visits: 29
Thank you for the response.
I actually ended up using a suggestion from another forum that worked out really well.
The moderator there helped me to put together a Script Component that took into account all factors.

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
const string pFirstName = @"^[A-Z]([-']?[a-z]+)*";
const string pSuffix = @"((Jr|Sr|I|V|X)( ?))*";
const string pInitial = @"(?<=\s)[A-Z](?=\s)";
const string pLastName = @"(?!(?:Jr|Sr|I|V|X|^))([A-Z][-'\s]?[a-z]+)";

string fullName = Row.Name.ToString();
string firstName = Regex.Match(fullName, pFirstName).Value;
string suffix = Regex.Match(fullName, pSuffix).Value;
string initial = Regex.Match(fullName, pInitial).Value;
string lastName = Regex.Match(fullName, pLastName).Value;

if (!string.IsNullOrEmpty(initial))
lastName += " " + initial;

if (!string.IsNullOrEmpty(suffix))
lastName += " " + suffix;

Row.FirstName = firstName;
Row.LastName = lastName;
}

I figured I would share it in hopes of it assisting others that may have a similar situation.
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)

Group: General Forum Members
Points: 84677 Visits: 41069
It'll be interesting what you do with last names like "van der waals" and first names like "Mary Jo" as well as people who have alphabet-soup after their last names and titles before their first names.

My recommendation would be to forget about splitting names especially since most people end up doing "fuzzy" lookups for names anyway. Unless you have an application that has people populate separate on-screen fields to begin with, trying to splits on names can be terribly frustrating.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search