Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Function with 200+ IF statements! Expand / Collapse
Author
Message
Posted Tuesday, February 19, 2013 5:13 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 10:21 AM
Points: 489, Visits: 2,140
I have a lot of respect for the optimisation gurus on this forum so when one of these guys (Jeff Moden) makes a comment like in this thread http://www.sqlservercentral.com/Forums/Topic1420602-3077-2.aspx#bm1420669 then I get really worried.


I've also found that if a function has over 200 IF statements, that it's time to go back to the drawing board. Many such "hopeless" functions frequently do have a much more effective and easy to code solution. there may also be the case where a stored procedure is more appropriate than a function


This has been on my mind since the end of last week so now I want to put this issue to sleep by opening a new discussion to see if any of you good people can help me!

The function in question is used to determine whether two records have the same Firstname, Middlename and Surname.

Consider the following example of two records:



It is clear that these two records refer to the same person but doing a self join on ContactName, Address1 and Postcode would not return these two records as duplicates because there are differences in Address1 and name.

Another approach is to join the records only on postcode then compare the name elements to see calculate a matching score for the name part.

SELECT 
a.contactname as master_contactname,
b.contactname as duplicate_contactname
FROM someTable as a
inner join someTable as b
on a.Postcode = b.PostCode
WHERE a.ID < b.ID



My Name scoring function takes 8 parameters (4 for the master record and the other 4 of the potential duplicate record)

1) NormalisedName (Forname, Surname, Middlename)
2) PhoneticForename
3) PhoneticSurname
4) PhoneticMiddlename

Now this is where the 200 + IF statements start and the reason for them is as follows:

IF Master_Surname = Duplicate_Surname AND Master_Firstname = Duplicate_Firstname THEN
the ultimate outcome of the name comparison will depend on how the middle names compare BUT the middle names can have 5 different possibilities

IF Master_Middlename = Duplicate_Middlename then EXACT match
IF Master_Middlename is empty or Duplicate_Middlename is empty then LIKELY match
IF Master_Middlename SOUNDS the same as Duplicate_Middlename then LIKELY match


...etc ...etc

I also have to account for when Master_Surname SOUNDS like Duplicate Surname and for this I also have to look at the firstname and middlename

Hope this makes sense?

I can't see how else to do this without using so many IFs?

In the end I created a SQL CLR in C# which is mega fast compared to the SQL UDF equivalent.

Any suggestions?


-----------------------------------
http://www.SQL4n00bs.com
Post #1421571
Posted Tuesday, February 19, 2013 5:26 AM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Wednesday, May 08, 2013 10:24 AM
Points: 237, Visits: 413
I would have used a multi step UNION, with one of the columns returned the rank value you need. That would be more optimal than the UDF. I would have to test it out to see it compare against the CLR.
Post #1421585
Posted Tuesday, February 19, 2013 5:28 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 10:21 AM
Points: 489, Visits: 2,140
I don't understand what you mean by a multi step UNION. Can you give a simple example?

-----------------------------------
http://www.SQL4n00bs.com
Post #1421588
Posted Tuesday, February 19, 2013 5:43 AM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Wednesday, May 08, 2013 10:24 AM
Points: 237, Visits: 413
I whipped this up quick, and looking at it you would probably want to combine all the EXACTSinto one UNION etc... but I broke them out so you can see what I was doing.

Something like this:

CREATE PROCEDURE dbo.ProcName
(@ID1 INT,
@ID2 INT)
AS
BEGIN

SELECT A.id,'EXACT'
FROM sometable A
CROSS APPLY sometable B
WHERE A.ID=@id1
AND B.id=@id2
AND A.surname=B.surname
AND A.firstname=B.firstname
UNION
SELECT A.id,'EXACT'
FROM sometable A
CROSS APPLY sometable B
WHERE A.ID=@id1
AND B.id=@id2
AND A.middle=B.middle
UNION
SELECT A.id,'LIKELY'
FROM sometable A
CROSS APPLY sometable B
WHERE A.ID=@id1
AND B.id=@id2
AND (A.middle=NULL OR A.middle =''
OR A.middle=NULL OR A.middle ='')

etc .....


END

Post #1421592
Posted Tuesday, February 19, 2013 8:10 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 10:21 AM
Points: 489, Visits: 2,140
Thanks for this. I see what you're doing but I will have to play around with it to get it to fit with my name matching matrix which looks something like the below: (there are other ones for when the surnames SOUND the same, Surnames approximately the same, surnames sound approximately the same, surnames containment, surname sound containment etc....

    <lastnames match="equal">

<firstnames match="equal">
<middlenames match="equal">sure</middlenames>
<middlenames match="both_empty">sure</middlenames>
<middlenames match="one_empty">sure</middlenames>
<middlenames match="approx">likely</middlenames>
<middlenames match="contains">likely</middlenames>
<middlenames match="unequal">possible</middlenames>
</firstnames>

<firstnames match="sounds_equal">
<middlenames match="equal">sure</middlenames>
<middlenames match="both_empty">likely</middlenames>
<middlenames match="one_empty">likely</middlenames>
<middlenames match="approx">possible</middlenames>
<middlenames match="contains">possible</middlenames>
<middlenames match="unequal">zero</middlenames>
</firstnames>

<firstnames match="both_empty">
<middlenames match="equal">zero</middlenames>
<middlenames match="both_empty">likely</middlenames>
<middlenames match="one_empty">zero</middlenames>
<middlenames match="approx">zero</middlenames>
<middlenames match="contains">zero</middlenames>
<middlenames match="unequal">zero</middlenames>
</firstnames>

<firstnames match="one_empty">
<middlenames match="equal">zero</middlenames>
<middlenames match="both_empty">likely</middlenames>
<middlenames match="one_empty">likely</middlenames>
<middlenames match="approx">zero</middlenames>
<middlenames match="contains">likely</middlenames>
<middlenames match="unequal">zero</middlenames>
</firstnames>

<firstnames match="approx">
<middlenames match="equal">likely</middlenames>
<middlenames match="both_empty">likely</middlenames>
<middlenames match="one_empty">likely</middlenames>
<middlenames match="approx">possible</middlenames>
<middlenames match="contains">possible</middlenames>
<middlenames match="unequal">zero</middlenames>
</firstnames>

<firstnames match="sounds_approx">
<middlenames match="equal">likely</middlenames>
<middlenames match="both_empty">possible</middlenames>
<middlenames match="one_empty">possible</middlenames>
<middlenames match="approx">zero</middlenames>
<middlenames match="contains">possible</middlenames>
<middlenames match="unequal">zero</middlenames>
</firstnames>

<firstnames match="contains">
<middlenames match="equal">likely</middlenames>
<middlenames match="both_empty">likely</middlenames>
<middlenames match="one_empty">likely</middlenames>
<middlenames match="approx">possible</middlenames>
<middlenames match="contains">possible</middlenames>
<middlenames match="unequal">zero</middlenames>
</firstnames>

<firstnames match="unequal">
<middlenames match="equal">zero</middlenames>
<middlenames match="both_empty">zero</middlenames>
<middlenames match="one_empty">zero</middlenames>
<middlenames match="approx">zero</middlenames>
<middlenames match="contains">possible</middlenames>
<middlenames match="unequal">zero</middlenames>
</firstnames>

</lastnames>



-----------------------------------
http://www.SQL4n00bs.com
Post #1421662
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse