SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


MetaPhone - Implementing


MetaPhone - Implementing

Author
Message
JuanBob
JuanBob
Ten Centuries
Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)

Group: General Forum Members
Points: 1042 Visits: 269

I can find many articles on MetaPhone, but nothing that is understandable on how to implement it and use in Query Analyzer

I need to use this type of function to de-dupe my database. SoundEx is very bad. Any suggestions or resources you can recommend would be much appreciated!

Thanks


Shane Petroff
Shane Petroff
Valued Member
Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)

Group: General Forum Members
Points: 66 Visits: 365
first google hit...

http://www.windowsitpro.com/Article/ArticleID/26094/26094.html?Ad=1

alternatively, just wrap Lawrence Phillips C++ version
Russell Shilling
Russell Shilling
UDP Broadcaster
UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)

Group: General Forum Members
Points: 1468 Visits: 209

I'm not sure how "understandable" this will be, as the algorithm is very complex, but here's an implementation I found on the net some time ago. (The ) are close parenthesis -- How do you turn off the smilie substitution, or format code?)


declare @ret char(20)
exec @ret = doublemetaphone 'rachael'
print @ret
exec @ret = doublemetaphone 'rachel'
print @ret
exec @ret = doublemetaphone 'richelle'
print @ret

/* Metaphone */

--**************************************
--
-- Name: dbo.DoubleMetaPhone
-- Description:Creates a better phonetic
-- matching key than Soundex.
-- By: Keith Henry
--
-- Inputs: proper name to generate key
--
-- Returns: char(10) string containing two
-- char(5) keys
-- The first 5 chars are the most common pronouciation, AND the last 5 are the next most common.
--
-- Assumes: This is based off a c++ article that described a better phonetic algorithm than soundex.
-- I re-wrote the algorithm IN t-SQL (the languages are too different to simply translate) so that
-- I could USE it WITH our UK Market File. Soundex IS OK, but it is OVER 100 years old now AND it
-- misses an awful lot OF names that should match.
--
-- Side Effects: More load than soundex, but still much better.
--
-- This code is copyrighted and has-- limited warranties.Please see http://
-- www.Planet-Source-Code.com/vb/scripts/Sh
-- owCode.asp?txtCodeId=519&lngWId=5--for details.--**************************************
--

ALTER FUNCTION dbo.DoubleMetaPhone (@str varchar(70))
RETURNS char(20)
AS


BEGIN

/*#########################################################################

DOUBLE Metaphone Phonetic Matching Function

This reduces word TO approximate phonetic string. This is deliberately
NOT a direct phonetic

Based OFF original C++ code AND algorithm BY
Lawrence Philips (lphilips_AT_verity.com)

Published IN the C/C++ Users Journal:
http://www.cuj.com/articles/2000/0006/0006d/0006d.htm?topic=articles

Original Metaphone presented in article in "Computer Language" in 1990.

Reduces alphabet TO

The 14 constonant sounds:
"sh""p"or"b" "th"
| | |
X S K J T F H L M N P R 0 W

DROP vowels EXCEPT at the beginning

Produces a char(10) string. The left(@result,5) gives the most common
pronouciation, right(@result,5) gives the commonest alternate.


Translated INTO t-SQL BY Keith Henry (keithh@lbm-solutions.com)

#########################################################################*/

DECLARE @original varchar(70),
@primary varchar(70),
@secondary varchar(70),
@length int,
@last int,
@current int,
@strcur1 char(1) ,
@strnext1 char(1) ,
@strprev1 char(1),
@SlavoGermanic bit

SET @SlavoGermanic = 0
SET @primary = ''
SET @secondary = ''
SET @current = 1
SET @length = len(@str)
SET @last = @length
SET @original = lTrim(isnull(@str,'')) + ' '

SET @original = upper(@original)

IF patindex('%[WK]%',@str) + charindex('CZ',@str) + charindex('WITZ',@str) <> 0
SET @SlavoGermanic = 1

-- skip this at beginning OF word
IF substring(@original, 1, 2) in ('GN', 'KN', 'PN', 'WR', 'PS')
SET @current = @current + 1

-- Initial 'X' IS pronounced 'Z' e.g. 'Xavier'
IF substring(@original, 1, 1) = 'X'
BEGIN
SET @primary = @primary + 'S'-- 'Z' maps TO 'S'
SET @secondary = @secondary + 'S'
SET @current = @current + 1
END

IF substring(@original, 1, 1) in ('A', 'E', 'I', 'O', 'U', 'Y')
BEGIN
SET @primary = @primary + 'A' -- ALL init vowels now map TO 'A'
SET @secondary = @secondary + 'A'
SET @current = @current + 1
END

WHILE @current <= @length
BEGIN
IF len(@primary) >= 10 BREAK

SET @strcur1 = substring(@original, @current, 1)
SET @strnext1 = substring(@original, (@current + 1), 1)
SET @strprev1 = substring(@original, (@current - 1), 1)

IF @strcur1 IN ('A', 'E', 'I', 'O', 'U', 'Y', ' ', '''', '-')
SET @current = @current + 1
ELSE

IF @strcur1 = 'B' -- '-mb', e.g. 'dumb', already skipped OVER...
BEGIN
SET @primary = @primary + 'P'
SET @secondary = @secondary + 'P'

IF @strnext1 = 'B'
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
ELSE

IF @strcur1 = 'Ç'
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'S'
SET @current = @current + 1
END
ELSE

IF @strcur1 = 'C'
BEGIN
IF @strnext1 = 'H'
BEGIN

IF substring(@original, @current, 4) = 'CHIA' -- italian 'chianti'
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
END
ELSE
BEGIN
IF @current > 1 -- find 'michael'
AND substring(@original, @current, 4) = 'CHAE'
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'X'
END
ELSE
BEGIN
IF @current = 1 -- greek roots e.g. 'chemistry', 'chorus'
AND (substring(@original, @current + 1, 5) in ('HARAC', 'HARIS')
OR substring(@original, @current + 1, 3) in ('HOR', 'HYM', 'HIA', 'HEM')
 
AND substring(@original, 1, 5) <> 'CHORE'
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
END
ELSE
BEGIN
IF ( substring(@original, 0, 4) in ('VAN ', 'VON ') -- germanic, greek, or otherwise 'ch' FOR 'kh' sound
OR substring(@original, 0, 3) = 'SCH'
 
OR substring(@original, @current - 2, 6) in ('ORCHES', 'ARCHIT', 'ORCHID') -- 'architect' but NOT 'arch', orchestra', 'orchid'
OR substring(@original, @current + 2, 1) in ('T', 'S')
OR ( ( @strprev1 IN ('A','O','U','E')
OR @current = 1
 
AND substring(@original, @current + 2, 1) in ('L','R','N','M','B','H','F','V','W',' ') -- e.g. 'wachtler', 'weschsler', but NOT 'tichner'
 
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
END
ELSE
BEGIN
IF (@current > 1)
BEGIN
IF substring(@original, 1, 2) = 'MC' -- e.g. 'McHugh'
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
END
ELSE
BEGIN
SET @primary = @primary + 'X'
SET @secondary = @secondary + 'K'
END
END
ELSE
BEGIN
SET @primary = @primary + 'X'
SET @secondary = @secondary + 'X'
END
END
END
END
END
SET @current = @current + 2
END --ch logic
ELSE
BEGIN
IF @strnext1 = 'C' -- DOUBLE 'C', but NOT McClellan'
AND not(@current = 1
AND substring(@original, 1, 1) = 'M'
 
BEGIN
IF substring(@original, @current + 2, 1) in ('I','E','H') -- 'bellocchio' but NOT 'bacchus'
AND substring(@original, @current + 2, 2) <> 'HU'
BEGIN
IF ( @current = 2 -- 'accident', 'accede', 'succeed'
AND @strprev1 = 'A'
 
OR substring(@original, @current - 1, 5) in ('UCCEE', 'UCCES')
BEGIN
SET @primary = @primary + 'KS'
SET @secondary = @secondary + 'KS'
END
ELSE
BEGIN -- 'bacci', 'bertucci', other italian
SET @primary = @primary + 'X'
SET @secondary = @secondary + 'X'
-- e.g. 'focaccia' IF substring(@original, @current, 4) = 'CCIA'
END
SET @current = @current + 3
END
ELSE
BEGIN
SET @primary = @primary + 'K' -- Pierce's RULE
SET @secondary = @secondary + 'K'
SET @current = @current + 2
END
END
ELSE
BEGIN
IF @strnext1 IN ('K','G','Q')
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
SET @current = @current + 2
END
ELSE
BEGIN
IF @strnext1 IN ('I','E','Y')
BEGIN
IF substring(@original, @current, 3) in ('CIO','CIE','CIA') -- italian vs. english
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'X'
END
ELSE
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'S'
END
SET @current = @current + 2
END
ELSE
BEGIN
IF @strnext1 = 'Z' -- e.g. 'czerny'
AND substring(@original, @current -2, 4) <> 'WICZ'
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'X'
SET @current = @current + 2
END
ELSE
BEGIN
IF @current > 2 -- various gremanic
AND substring(@original, @current - 2,1) NOT in ('A', 'E', 'I', 'O', 'U', 'Y')
AND substring(@original, @current - 1, 3) = 'ACH'
AND ((substring(@original, @current + 2, 1) <> 'I')
AND ((substring(@original, @current + 2, 1) <> 'E')
OR substring(@original, @current - 2, 6) in ('BACHER', 'MACHER')
 
 
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
SET @current = @current + 2
END
ELSE
BEGIN
IF @current = 1 -- special CASE 'caesar'
AND substring(@original, @current, 6) = 'CAESAR'

BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'S'
SET @current = @current + 2
END
ELSE
BEGIN -- final ELSE
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'

IF substring(@original, @current + 1, 2) in (' C',' Q',' G') -- name sent in 'mac caffrey', 'mac gregor'
SET @current = @current + 3
ELSE
SET @current = @current + 1
END
END
END
END
END
END
END
END
ELSE

IF @strcur1 = 'D'
BEGIN
IF @strnext1 = 'G'
BEGIN
IF substring(@original, @current + 2, 1) in ('I','E','Y')
BEGIN
SET @primary = @primary + 'J' -- e.g. 'edge'
SET @secondary = @secondary + 'J'
SET @current = @current + 3
END
ELSE
BEGIN
SET @primary = @primary + 'TK' -- e.g. 'edgar'
SET @secondary = @secondary + 'TK'
SET @current = @current + 2
END
END
ELSE
BEGIN
IF substring(@original, @current, 2) in ('DT','DD')
BEGIN
SET @primary = @primary + 'T'
SET @secondary = @secondary + 'T'
SET @current = @current + 2
END
ELSE
BEGIN
SET @primary = @primary + 'T'
SET @secondary = @secondary + 'T'
SET @current = @current + 1
END
END
END
ELSE

IF @strcur1 = 'F'
BEGIN
SET @primary = @primary + 'F'
SET @secondary = @secondary + 'F'
IF (@strnext1 = 'F')
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
ELSE

IF @strcur1 = 'G'
BEGIN
IF (@strnext1 = 'H')
BEGIN
IF @current > 1
AND @strprev1 NOT IN ('A', 'E', 'I', 'O', 'U', 'Y')
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
END
ELSE
BEGIN

IF not( (@current > 2 -- Parker's RULE (with SOME further refinements) - e.g. 'hugh'
AND substring(@original, @current - 2, 1) in ('B','H','D')
  -- e.g. 'bough'
OR (@current > 3
AND substring(@original, @current - 3, 1) in ('B','H','D')
  -- e.g. 'broughton'
OR (@current > 4
AND substring(@original, @current - 4, 1) in ('B','H')
  
BEGIN
IF @current > 3 -- e.g. 'laugh', 'McLaughlin', 'cough', 'gough', 'rough', 'tough'
AND @strprev1 = 'U'
AND substring(@original, @current - 3, 1) in ('C','G','L','R','T')
BEGIN
SET @primary = @primary + 'F'
SET @secondary = @secondary + 'F'
END
ELSE
BEGIN
IF @current > 1
AND @strprev1 <> 'I'
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
END
ELSE
BEGIN
IF (@current < 4)
BEGIN
IF (@current = 1) -- 'ghislane', 'ghiradelli'
BEGIN
IF (substring(@original, @current + 2, 1) = 'I')
BEGIN
SET @primary = @primary + 'J'
SET @secondary = @secondary + 'J'
END
ELSE
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
END
END
END
END
END
END
END
SET @current = @current + 2
END
ELSE
BEGIN
IF (@strnext1 = 'N')
BEGIN
IF @current = 1
AND substring(@original, 0,1) in ('A', 'E', 'I', 'O', 'U', 'Y')
AND @SlavoGermanic = 0
BEGIN
SET @primary = @primary + 'KN'
SET @secondary = @secondary + 'N'
END
ELSE
BEGIN
-- NOT e.g. 'cagney'
IF substring(@original, @current + 2, 2) = 'EY'
AND (@strnext1 <> 'Y')
AND @SlavoGermanic = 0
BEGIN
SET @primary = @primary + 'N'
SET @secondary = @secondary + 'KN'
END
ELSE
BEGIN
SET @primary = @primary + 'KN'
SET @secondary = @secondary + 'KN'
END
END
SET @current = @current + 2
END
ELSE
BEGIN
IF substring(@original, @current + 1, 2) = 'LI' -- 'tagliaro'
AND @SlavoGermanic = 0
BEGIN
SET @primary = @primary + 'KL'
SET @secondary = @secondary + 'L'
SET @current = @current + 2
END
ELSE
BEGIN
IF @current = 1 -- -ges-, -gep-, -gel- at beginning
AND (@strnext1 = 'Y'
OR substring(@original, @current + 1, 2) in ('ES','EP','EB','EL','EY','IB','IL','IN','IE', 'EI','ER')
 
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'J'
SET @current = @current + 2
END
ELSE
BEGIN
IF (substring(@original, @current + 1, 2) = 'ER' -- -ger-, -gy-
OR @strnext1 = 'Y'
 
AND substring(@original, 1, 6) NOT in ('DANGER','RANGER','MANGER')
AND @strprev1 NOT IN ('E', 'I')
AND substring(@original, @current - 1, 3) NOT in ('RGY','OGY')
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'J'
SET @current = @current + 2
END
ELSE
BEGIN
IF @strnext1 IN ('E','I','Y') -- italian e.g. 'biaggi'
OR substring(@original, @current -1, 4) in ('AGGI','OGGI')
BEGIN
IF (substring(@original, 1, 4) in ('VAN ', 'VON ') -- obvious germanic
OR substring(@original, 1, 3) = 'SCH'
 
OR substring(@original, @current + 1, 2) = 'ET'
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
END
ELSE
BEGIN
-- always soft IF french ending
IF substring(@original, @current + 1, 4) = 'IER '
BEGIN
SET @primary = @primary + 'J'
SET @secondary = @secondary + 'J'
END
ELSE
BEGIN
SET @primary = @primary + 'J'
SET @secondary = @secondary + 'K'
END
END
SET @current = @current + 2
END
ELSE
BEGIN -- other options exausted call it k sound
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'
IF (@strnext1 = 'G')
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
END
END
END
END
END
END
ELSE

IF @strcur1 = 'H'
BEGIN
IF (@current = 1 -- ONLY keep if first & before vowel or btw. 2 vowels
OR @strprev1 IN ('A', 'E', 'I', 'O', 'U', 'Y')
 
AND @strnext1 IN ('A', 'E', 'I', 'O', 'U', 'Y')
BEGIN
SET @primary = @primary + 'H'
SET @secondary = @secondary + 'H'
SET @current = @current + 2
END
ELSE
SET @current = @current + 1
END
ELSE

IF @strcur1 = 'J'
BEGIN
IF substring(@original, @current, 4) = 'JOSE' -- obvious spanish, 'jose', 'san jacinto'
OR substring(@original, 1, 4) = 'SAN '
BEGIN
IF (@current = 1
AND substring(@original, @current + 4, 1) = ' '
 
OR substring(@original, 1, 4) = 'SAN '
BEGIN
SET @primary = @primary + 'H'
SET @secondary = @secondary + 'H'
END
ELSE
BEGIN
SET @primary = @primary + 'J'
SET @secondary = @secondary + 'H'
END

SET @current = @current + 1
END
ELSE
BEGIN
IF @current = 1
BEGIN
SET @primary = @primary + 'J' -- Yankelovich/Jankelowicz
SET @secondary = @secondary + 'A'
SET @current = @current + 1
END
ELSE
BEGIN
IF @strprev1 IN ('A', 'E', 'I', 'O', 'U', 'Y') -- spanish pron. OF .e.g. 'bajador'
AND @SlavoGermanic = 0
AND @strnext1 IN ('A','O')
BEGIN
SET @primary = @primary + 'J'
SET @secondary = @secondary + 'H'
SET @current = @current + 1
END
ELSE
BEGIN
IF (@current = @last)
BEGIN
SET @primary = @primary + 'J'
SET @secondary = @secondary + ''
SET @current = @current + 1
END
ELSE
BEGIN
IF @strnext1 IN ('L','T','K','S','N','M','B','Z')
AND @strprev1 NOT IN ('S','K','L')
BEGIN
SET @primary = @primary + 'J'
SET @secondary = @secondary + 'J'
SET @current = @current + 1
END
ELSE
BEGIN
IF (@strnext1 = 'J') -- it could happen
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
END
END
END
END
END
ELSE

IF @strcur1 = 'K'
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'

IF (@strnext1 = 'K')
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
ELSE

IF @strcur1 = 'L'
BEGIN
IF (@strnext1 = 'L')
BEGIN
IF (@current = (@length - 3) -- spanish e.g. 'cabrillo', 'gallegos'
AND substring(@original, @current - 1, 4) in ('ILLO','ILLA','ALLE')
 
OR ((substring(@original, @last - 1, 2) in ('AS','OS')
OR substring(@original, @last, 1) in ('A','O')
 
AND substring(@original, @current - 1, 4) = 'ALLE'
 
SET @primary = @primary + 'L' --Alternate IS silent
ELSE
BEGIN
SET @primary = @primary + 'L'
SET @secondary = @secondary + 'L'
END
SET @current = @current + 2
END
ELSE
BEGIN
SET @current = @current + 1
SET @primary = @primary + 'L'
SET @secondary = @secondary + 'L'
END
END
ELSE

IF @strcur1 = 'M'
BEGIN
SET @primary = @primary + 'M'
SET @secondary = @secondary + 'M'

IF substring(@original, @current - 1, 3) = 'UMB'
AND (@current + 1 = @last
OR substring(@original, @current + 2, 2) = 'ER'
  -- 'dumb', 'thumb'
OR @strnext1 = 'M'
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
ELSE

IF @strcur1 IN ('N','Ñ')
BEGIN
SET @primary = @primary + 'N'
SET @secondary = @secondary + 'N'

IF @strnext1 IN ('N','Ñ')
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
ELSE

IF @strcur1 = 'P'
BEGIN
IF (@strnext1 = 'H')
BEGIN
SET @current = @current + 2
SET @primary = @primary + 'F'
SET @secondary = @secondary + 'F'
END
ELSE
BEGIN
-- also account FOR 'campbell' AND 'raspberry'
IF @strnext1 IN ('P','B')
SET @current = @current + 2
ELSE
BEGIN
SET @current = @current + 1
SET @primary = @primary + 'P'
SET @secondary = @secondary + 'P'
END
END
END
ELSE

IF @strcur1 = 'Q'
BEGIN
SET @primary = @primary + 'K'
SET @secondary = @secondary + 'K'

IF (@strnext1 = 'Q')
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
ELSE

IF @strcur1 = 'R'
BEGIN
IF @current = @last -- french e.g. 'rogier', but exclude 'hochmeier'
AND @SlavoGermanic = 0
AND substring(@original, @current - 2, 2) = 'IE'
AND substring(@original, @current - 4, 2) NOT in ('ME','MA')
SET @secondary = @secondary + 'R' --set @primary = @primary + ''
ELSE
BEGIN
SET @primary = @primary + 'R'
SET @secondary = @secondary + 'R'
END

IF (@strnext1 = 'R')
BEGIN
IF substring(@original, @current, 3) = 'RRI' --alternate Kerrigan, Corrigan
SET @secondary = @secondary + 'R'

SET @current = @current + 2
END
ELSE
SET @current = @current + 1
END
ELSE

IF @strcur1 = 'S'
BEGIN
IF substring(@original, @current - 1, 3) in ('ISL','YSL') -- special cases 'island', 'isle', 'carlisle', 'carlysle'
SET @current = @current + 1 --silent s
ELSE
BEGIN
IF substring(@original, @current, 2) = 'SH'
BEGIN
-- germanic
IF substring(@original, @current + 1, 4) in ('HEIM','HOEK','HOLM','HOLZ')
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'S'
END
ELSE
BEGIN
SET @primary = @primary + 'X'
SET @secondary = @secondary + 'X'
END

SET @current = @current + 2
END
ELSE
BEGIN




-- italian & armenian
IF substring(@original, @current, 3) in ('SIO','SIA')
OR substring(@original, @current, 4) in ('SIAN')
BEGIN
IF @SlavoGermanic = 0
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'X'
END
ELSE
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'S'
END

SET @current = @current + 3
END
ELSE
BEGIN
IF (@current = 1 -- german & anglicisations, e.g. 'smith' match 'schmidt', 'snider' match 'schneider'
AND @strnext1 IN ('M','N','L','W') -- also, -sz- in slavic language altho in hungarian it IS pronounced 's'
 
OR @strnext1 = 'Z'
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'X'

IF @strnext1 = 'Z'
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
ELSE
BEGIN
IF substring(@original, @current, 2) = 'SC'
BEGIN
IF substring(@original, @current + 2, 1) = 'H' -- Schlesinger's RULE
BEGIN
IF substring(@original, @current + 3, 2) in ('OO','ER','EN','UY','ED','EM') -- dutch origin, e.g. 'school', 'schooner'
BEGIN
IF substring(@original, @current + 3, 2) in ('ER','EN') -- 'schermerhorn', 'schenker'
BEGIN
SET @primary = @primary + 'X'
SET @secondary = @secondary + 'SK'
END
ELSE
BEGIN
SET @primary = @primary + 'SK'
SET @secondary = @secondary + 'SK'
END

SET @current = @current + 3
END
ELSE
BEGIN
IF @current = 1
AND substring(@original, 3,1) NOT in ('A', 'E', 'I', 'O', 'U', 'Y')
AND substring(@original, @current + 3, 1) <> 'W'
BEGIN
SET @primary = @primary + 'X'
SET @secondary = @secondary + 'S'
END
ELSE
BEGIN
SET @primary = @primary + 'X'
SET @secondary = @secondary + 'X'
END

SET @current = @current + 3
END
END
ELSE
BEGIN
IF substring(@original, @current + 2, 1) in ('I','E','Y')
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'S'
END
ELSE
BEGIN
SET @primary = @primary + 'SK'
SET @secondary = @secondary + 'SK'
END
SET @current = @current + 3
END
END
ELSE
BEGIN
IF @current = 1 -- special CASE 'sugar-'
AND substring(@original, @current, 5) = 'SUGAR'
BEGIN
SET @primary = @primary + 'X'
SET @secondary = @secondary + 'S'
SET @current = @current + 1
END
ELSE
BEGIN
IF @current = @last -- french e.g. 'resnais', 'artois'
AND substring(@original, @current - 2, 2) in ('AI','OI')
SET @secondary = @secondary + 'S' --set @primary = @primary + ''
ELSE
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'S'
END

IF @strnext1 IN ('S','Z')
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
END
END
END
END
END
END
ELSE

IF @strcur1 = 'T'
BEGIN
IF substring(@original, @current, 4) = 'TION'
BEGIN
SET @primary = @primary + 'X'
SET @secondary = @secondary + 'X'
SET @current = @current + 3
END
ELSE
IF substring(@original, @current, 3) in ('TIA','TCH')
BEGIN
SET @primary = @primary + 'X'
SET @secondary = @secondary + 'X'
SET @current = @current + 3
END
ELSE
IF substring(@original, @current, 2) = 'TH'
OR substring(@original, @current, 3) = 'TTH'
BEGIN
IF substring(@original, @current + 2, 2) in ('OM','AM') -- special CASE 'thomas', 'thames' or germanic
OR substring(@original, 0, 4) in ('VAN ','VON ')
OR substring(@original, 0, 3) = 'SCH'
BEGIN
SET @primary = @primary + 'T'
SET @secondary = @secondary + 'T'
END
ELSE
BEGIN
SET @primary = @primary + '0'
SET @secondary = @secondary + 'T'
END
SET @current = @current + 2
END
ELSE
BEGIN
IF @strnext1 IN ('T','D')
BEGIN
SET @current = @current + 2
SET @primary = @primary + 'T'
SET @secondary = @secondary + 'T'
END
ELSE
BEGIN
SET @current = @current + 1
SET @primary = @primary + 'T'
SET @secondary = @secondary + 'T'
END
END
END
ELSE

IF @strcur1 = 'V'
IF (@strnext1 = 'V')
SET @current = @current + 2
ELSE
BEGIN
SET @current = @current + 1
SET @primary = @primary + 'F'
SET @secondary = @secondary + 'F'
END
ELSE

IF @strcur1 = 'W'
BEGIN
-- can also be IN middle OF word
IF substring(@original, @current, 2) = 'WR'
BEGIN
SET @primary = @primary + 'R'
SET @secondary = @secondary + 'R'
SET @current = @current + 2
END
ELSE
IF @current = 1
AND (@strnext1 IN ('A', 'E', 'I', 'O', 'U', 'Y')
OR substring(@original, @current, 2) = 'WH'
 
BEGIN
IF @strnext1 IN ('A', 'E', 'I', 'O', 'U', 'Y') -- Wasserman should match Vasserman
BEGIN
SET @primary = @primary + 'A'
SET @secondary = @secondary + 'F'
SET @current = @current + 1
END
ELSE
BEGIN
SET @primary = @primary + 'A' -- need Uomo TO match Womo
SET @secondary = @secondary + 'A'
SET @current = @current + 1
END
END
ELSE
IF (@current = @last -- Arnow should match Arnoff
AND @strprev1 IN ('A', 'E', 'I', 'O', 'U', 'Y')
 
OR substring(@original, @current - 1, 5) in ('EWSKI','EWSKY','OWSKI','OWSKY')
OR substring(@original, 0, 3) = 'SCH'
BEGIN
SET @secondary = @secondary + 'F' --set @primary = @primary + ''
SET @current = @current + 1
END
ELSE
IF substring(@original, @current, 4) in ('WICZ','WITZ') -- polish e.g. 'filipowicz'
BEGIN
SET @primary = @primary + 'TS'
SET @secondary = @secondary + 'FX'
SET @current = @current + 4
END
ELSE
SET @current = @current + 1 -- ELSE skip it
END
ELSE

IF @strcur1 = 'X'
BEGIN
IF NOT (@current = @last -- french e.g. breaux
AND (substring(@original, @current - 3, 3) in ('IAU', 'EAU')
OR substring(@original, @current - 2, 2) in ('AU', 'OU')
 
 
BEGIN
SET @primary = @primary + 'KS'
SET @secondary = @secondary + 'KS'
END --else skip it

IF @strnext1 IN ('C','X')
SET @current = @current + 2
ELSE
SET @current = @current + 1
END
ELSE

IF @strcur1 = 'Z'
BEGIN
IF (@strnext1 = 'Z')
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'S'
SET @current = @current + 2
END
ELSE
BEGIN
IF (@strnext1 = 'H') -- chinese pinyin e.g. 'zhao'
BEGIN
SET @primary = @primary + 'J'
SET @secondary = @secondary + 'J'
SET @current = @current + 2
END
ELSE
BEGIN
IF (substring(@original, @current + 1, 2) in ('ZO', 'ZI', 'ZA'))
OR (@SlavoGermanic = 1
AND (@current > 1
AND @strprev1 <> 'T'
 
 
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'TS'
END
ELSE
BEGIN
SET @primary = @primary + 'S'
SET @secondary = @secondary + 'S'
END
END
SET @current = @current + 1
END
END
ELSE
SET @current = @current + 1
END
RETURN cast(@primary as char(10)) + cast(@secondary as char(10))
end



So long, and thanks for all the fish,

Russell Shilling, MCDBA, MCSA 2K3, MCSE 2K3
JuanBob
JuanBob
Ten Centuries
Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)

Group: General Forum Members
Points: 1042 Visits: 269
Thank you for the info! I did find this code in multiple places, but no instructions or direction with what to do with it. I'll keep digging, but I appreciate the feedback!
Russell Shilling
Russell Shilling
UDP Broadcaster
UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)

Group: General Forum Members
Points: 1468 Visits: 209

I don't think you'll find a tutorial that will meet your exact needs. Are you hoping to find something you can cut-n-paste that will de-dupe your DB?

Perhaps you should contract with someone to do this for you?



So long, and thanks for all the fish,

Russell Shilling, MCDBA, MCSA 2K3, MCSE 2K3
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search