Function to split name field

  • Good afternoon,

    With a bit of effort this morning, I've managed to get my function to split my name field. There are three tables required by the function:

    Performance is not too bad but not too great. Can I have some second opinions on this please?

    create table dbo.NameSplit_Title_Table (ID int not null identity(1, 1), Title varchar(50))

    insert into NameSplit_Title_Table

    select 'Abbot' union all

    select 'Arq' union all

    select 'Baroness' union all

    select 'Brig' union all

    select 'Brigadier' union all

    select 'Canon' union all

    select 'Capitaine' union all

    select 'Capt' union all

    select 'Captai' union all

    select 'Captain' union all

    select 'Chief' union all

    select 'Cmdr' union all

    select 'CMDT' union all

    select 'Col' union all

    select 'Colonel' union all

    select 'Commander' union all

    select 'Commodore' union all

    select 'Councillor' union all

    select 'Count' union all

    select 'Countess' union all

    select 'Cpt' union all

    select 'Dame' union all

    select 'De Heer' union all

    select 'Deacon' union all

    select 'Deaconess' union all

    select 'Detect' union all

    select 'Dhr' union all

    select 'District Judge' union all

    select 'Doctor' union all

    select 'Dokter' union all

    select 'Dr' union all

    select 'DRS' union all

    select 'F Lt' union all

    select 'Father' union all

    select 'Flight Lieut' union all

    select 'Flt Lieutenant' union all

    select 'Flt Lt' union all

    select 'Frau' union all

    select 'Frauline' union all

    select 'Fthr' union all

    select 'Gen' union all

    select 'General' union all

    select 'Group Capt' union all

    select 'Group Captain' union all

    select 'Heer' union all

    select 'Herr' union all

    select 'Hon' union all

    select 'Hr' union all

    select 'Ing' union all

    select 'Insp' union all

    select 'Inspec' union all

    select 'Inspector' union all

    select 'Ir' union all

    select 'Lady' union all

    select 'Lcol' union all

    select 'Lieut Col' union all

    select 'Lieut Colonel' union all

    select 'Lieutenant' union all

    select 'Lord' union all

    select 'Lt' union all

    select 'Lt C' union all

    select 'Lt Cmdr' union all

    select 'Lt Cmr' union all

    select 'Lt Col' union all

    select 'Ltcdr' union all

    select 'Madam' union all

    select 'Madame' union all

    select 'Maj' union all

    select 'Maj Gen' union all

    select 'Major' union all

    select 'Major General' union all

    select 'Marquis' union all

    select 'Master' union all

    select 'Meister' union all

    select 'Mej' union all

    select 'Mevrouw' union all

    select 'Miss' union all

    select 'Mjr' union all

    select 'Mlle' union all

    select 'Mme' union all

    select 'Mon' union all

    select 'Monsieur' union all

    select 'Mother' union all

    select 'Mr' union all

    select 'Mrs' union all

    select 'Ms' union all

    select 'Mstr' union all

    select 'Mthr' union all

    select 'Pastoor' union all

    select 'Pastor' union all

    select 'Prof' union all

    select 'Professor' union all

    select 'Rev' union all

    select 'Rev Canon' union all

    select 'Rev''d' union all

    select 'Revd' union all

    select 'Reverend' union all

    select 'Señorita' union all

    select 'Sgt' union all

    select 'Sig' union all

    select 'Sir' union all

    select 'Sis' union all

    select 'Sister' union all

    select 'Snr' union all

    select 'Sqn' union all

    select 'Sqn Ldr' union all

    select 'Squadron Leader' union all

    select 'Sr' union all

    select 'Sra' union all

    select 'Str' union all

    select 'The Honourable' union all

    select 'The Very Reverand' union all

    select 'Very Rev' union all

    select 'Wg Cdr' union all

    select 'Wing Cdr' union all

    select 'Wing Comdr' union all

    select 'Wing Commander'

    create table dbo.NameSplit_Prefix_Table (ID int not null identity(1, 1), Prefix varchar(50))

    insert into dbo.NameSplit_Prefix_Table

    select 'Du' union all

    select 'De' union all

    select 'Von' union all

    select 'Van' union all

    select 'Da' union all

    select 'Le' union all

    select 'La' union all

    select 'St' union all

    select 'El' union all

    select 'Dos'

    create table dbo.NameSplit_suffix_table (ID int not null identity(1, 1), Suffix varchar(50))

    insert into NameSplit_suffix_table

    select 'Jr' union all

    select 'Phd' union all

    select 'Msc' union all

    select 'Sr' union all

    select 'Hons' union all

    select 'Jnr' union all

    select 'Snr' union all

    select 'Obe' union all

    select 'Cbe' union all

    select 'Mbe' union all

    select 'Ba' union all

    select 'Bsc' union all

    select 'II' union all

    select 'III' union all

    select 'MCIOB' union all

    select 'FCIOB' union all

    select 'FRICS' union all

    select 'MRICS' union all

    select 'Esq'

    The function:

    CREATE FUNCTION [dbo].[fn_NameSplit_Step1] (@name varchar(100))

    RETURNS varchar(200) AS

    BEGIN

    declare @salutation varchar(100)

    declare @forename varchar(100)

    declare @initials varchar(100)

    declare @surname varchar(100)

    declare @suffix varchar(100)

    declare @myword varchar(100)

    declare @icount integer

    declare @icountprevious integer

    declare @namelen integer

    declare @returnval varchar(100)

    if @name is not null

    begin

    set @icount = 1

    set @icountprevious = 1

    set @myword = null

    set @salutation = ''

    set @forename = ''

    set @initials = ''

    set @surname = ''

    set @Suffix = ''

    -- Search suffix and remove from @name

    select @myword = Suffix from NameSplit_suffix_table where patindex('% ' + Suffix ,@name)<> 0 or @name = Suffix

    while @@rowcount <> 0

    begin

    set @suffix = ltrim(substring(@name, len(@name) - len(@myword), len(@myword)+1) + ' ' + @suffix)

    set @name = rtrim(substring(@name, 1, len(@name) - len(@myword)))

    select @myword = Suffix from NameSplit_suffix_table where patindex('% ' + Suffix ,@name)<> 0 or @name = Suffix

    end

    -- Search Salutation and remove from @name

    select @myword = Title from NameSplit_Title_Table where patindex(Title + ' %',@name)<> 0 or @name = Title

    while @@rowcount <> 0

    begin

    set @salutation = ltrim(@salutation + ' ' + substring(@name, 1, len(@myword)))

    set @name = ltrim(substring(@name, len(@myword)+1 ,len(@name) - len(@myword)))

    select @myword = Title from NameSplit_Title_Table where patindex(Title + ' %',@name)<> 0 or @name = Title

    end

    -- Split @name into words

    set @namelen = len(@name)

    while @icount < @namelen

    begin

    if substring(@name, @icount,1) = ' '

    begin

    set @myword = rtrim(ltrim(substring(@name, @icountprevious, @icount - @icountprevious)))

    -- if word is not surname prefix, then

    -- add to @forename with first letter added to @initials

    if not exists (select * from NameSplit_Prefix_Table where Prefix = @myword)

    begin

    set @forename = ltrim(@forename + ' ' + @myword)

    set @initials = ltrim(@initials + ' ' + left(@myword,1))

    set @icountprevious = @icount

    end

    -- else exit loop

    else break

    end

    set @icount = @icount + 1

    end

    -- Add remaining to Surname

    if @surname ='' set @surname = ltrim(substring(@name, @icountprevious ,@namelen - @icountprevious +1))

    end

    set @returnval = '|1'+ @salutation + '|2'+ @forename + '|3'+ @initials + '|4'+ @surname + '|5'+ @suffix

    return @returnval

    END

    Test case:

    select dbo.[fn_NameSplit_Step1] ('Professor Ludvic Van Fritz')

    --|1Professor|2Ludvic|3L|4Van Fritz|5

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • I have to ask why you have "Señorita" but not "Señor", and a few others like that. Or did I just miss them in the list?

    As far as the code goes, you might be better off moving this kind of thing into a CLR object. Performance will probably be better there, as T-SQL is poor at both string functions and loops, while CLR is good at both of those.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks for the feedback GSquared.

    I started off with a small list for my titles but then I added some more from a list of titles I found online 😀

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • As a proof-of-concept exercise, I built a "set-based" version of your UDF. It doesn't use any explicit loops, and so on. I tested the two side-by-side, and the one you have is about 4X faster than the set-based version. Because of the complexity of the rules here, it's massively over-complex, including multiple CTEs, some Outer Apply functions, and so on. Just a mess.

    So CLR is almost certainly your best bet, as expected.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks CSGuard. I will try and create a CLR implementation if time permits and will report back.

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply