• Here is my first attempt to make soundex more selective:

    Algorithm sqlserver uses:

    Replace consonants with digits as follows (but do not change the first letter):

    b, f, p, v => 1

    c, g, j, k, q, s, x, z => 2

    d, t => 3

    l => 4

    m, n => 5

    r => 6

    Collapse adjacent identical digits into a single digit of that value.

    Remove all non-digits after the first letter.

    Return the starting letter and the first three remaining digits.

    If needed, append zeroes to make it a letter and three digits.

    Lets extend this to 5 in length to make it more selective.

    On lastname data I see a selectivity increase from 1 in 3000 to 1 in 10000.

    Function contains no error checking and is not optimized.

    drop function dbo.soundex5;

    go

    create function Soundex5( @name varchar(30))

    returns varchar(5)

    as

    begin

    declare

    @out varchar(20),

    @i int,

    @vc varchar(1);

    set @name = @name + '00000';

    set @out = left(@name,1);

    set @i = 2;

    while @i <= len(@name)

    begin

    set @vc = substring(@name, @i, 1)

    set @out = @out +

    case

    when charindex(@vc, 'bfpv') > 0 then '1'

    when charindex(@vc, 'cgjkqsxz') > 0 then '2'

    when charindex(@vc, 'dt') > 0 then '3'

    when charindex(@vc, 'l') > 0 then '4'

    when charindex(@vc, 'mn') > 0 then '5'

    when charindex(@vc, 'r') > 0 then '6'

    when charindex(@vc, '0') > 0 then '0'

    else ''

    end;

    set @i = @i + 1;

    end

    -- ignore similar consecutive sounds

    set @out = REPLACE(@out,'11','1');

    set @out = REPLACE(@out,'22','2');

    set @out = REPLACE(@out,'33','3');

    set @out = REPLACE(@out,'44','4');

    set @out = REPLACE(@out,'55','5');

    set @out = REPLACE(@out,'66','6');

    return left(@out,5);

    end

    go