Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Help with Derived Column To Split-Up An Individual's Name into Separate Parts Expand / Collapse
Author
Message
Posted Friday, April 11, 2014 9:32 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Friday, November 7, 2014 3:40 PM
Points: 38, Visits: 248
I'm working on an SSIS package for a client. They receive files exported from their clients and they don't put any restrictions on how the data is provided to them via the Excel files they send, so 99.9% of the files arrive with a "Full Name" field instead of individual "First Name, "Middle Initial/Name", "Last Name", and "Surname" columns.

The names are generally in this pattern, which has a few possible variances as you can see as far as how the names are present in the column:

Adams, Jr., Dwight
([Last Name], [Surname], [First Name])

Allen, Frederick
([Last Name], [First Name])

Bishop, Joseph A.
([Last Name], [First Name] [Middle Initial])

Butler, III, Lamarrick J.
([Last Name], [Surname], [First Name] [Middle Initial])

Nicholson, Tina Michelle
([Last Name], [First Name] [Middle Name])

Cagle, Sr., Kimzey Alan
([Last Name], [Surname], [First Name] [Middle Name])

I'm trying to determine exactly how to handle these varying scenarios as far as how the names will be present in the single column as to appropriate & accurately split them into their various parts.

I wasn't sure if I should go with parsing formulas for each component of the full name directly within the derived column task, or if I should write a function that accepts the full name as an incoming parameter and then the parts of the name are parsed that way and somehow returned to the derived column task. I'm trying to tackle this the best way possible and I'm seeking suggestions.

I've seen a few code samples out there for parsing through a full name into the various individual parts of the name, but I haven't run across any specific examples yet that deal with full names structured exactly the way that I've shown above in the examples that they can be structured in the incoming files.

Thanks in advance.
Post #1561016
Posted Friday, April 11, 2014 9:47 AM
Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 3:34 PM
Points: 3,109, Visits: 11,516
As you probably know already, this is a very complex problem, especially when you are dealing with multipart names, non-western names, etc.

You may want to invest in commercial software that is designed to handle this problem:
http://www.softwarecompany.com/netgender-api.html



Post #1561030
Posted Sunday, April 13, 2014 4:38 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 10:12 AM
Points: 5,259, Visits: 12,195
The complexity of the problem means that a derived column solution is unlikely to be practical.

A Script Component would allow you to write code with as much complexity as you need, to handle the various combinations you have described, as long as you can come up with a set of logic rules to apply in order to perform the parsing.



Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

When you ask a question (and please do ask a question: "My T-SQL does not work" just doesn't cut it), please provide enough information for us to understand its context.
Post #1561269
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse