SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Soundex-Like functionality


Soundex-Like functionality

Author
Message
kchrysler
kchrysler
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 8
Background
I have inherited a system which has a table that stores CompanyName , and a column called ‘soundexe’; soundexe contains letters which are then compared to other company names using Like.

CREATE TABLE [dbo].[Commission_Soundex] (
[SoundID] [int] IDENTITY (1, 1) NOT NULL ,
[Soundexe] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CompanyName] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LeftOn] [bit] NOT NULL CONSTRAINT [DF_Commission_Soundex_LeftOn] DEFAULT (1),
[RightOn] [bit] NOT NULL CONSTRAINT [DF_Commission_Soundex_RightOn] DEFAULT (1),
[NotLike] [bit] NOT NULL CONSTRAINT [DF_Commission_Soundex_BothOn] DEFAULT (1),
[BothOff] [bit] NOT NULL CONSTRAINT [DF_Commission_Soundex_BothOff] DEFAULT (0),
CONSTRAINT [PK_Commission_Soundex] PRIMARY KEY CLUSTERED

Here are a few rows of the table for you to consider:
SoundID Soundexe CompanyName LeftOn RightOn NotLike BothOff
1 ABN ABM Ambro Vancouver 1 1 0 0
2 ABN ABN_AMRO 1 1 0 0
3 ACCENTU Accenture 1 1 0 0
4 AEGON AEGON 0 0 0 1
5 AG ED AG Edwards 1 1 0 0
6 AIC AIC Limited 0 1 0 0
7 AIG AIG 2006 0 1 0 0
8 TRIMA AIM TRIMARK 1 1 0 0
9 ALLIANCE CA Alliance Capital Management 1 1 0 0
10 ALLIANZ Allianz 0 0 0 1
11 ALLSTAT Allstate Financial 1 1 0 0

The Problem and Question
My problem is this – I need to update this table with hundreds of companies and update this ‘soundexe’ column also. But I can not find the algorithm which was used to create it.

My hope is that someone will look at this data and tell me some wonderful function that would take the company name ‘AIM TRIMARK’ and convert it to ‘TRIMA’.

Thanks so much!!!

p.s. I have attached a spreadsheet of the table for those of you wanting more examples.
Attachments
CompanyNames.xls (26 views, 56.00 KB)
John Rowan
John Rowan
SSChampion
SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)

Group: General Forum Members
Points: 13918 Visits: 4588
Uh, unless I don't quite understand your question, this is just a simple update statement.


DECLARE @Commission_Soundex TABLE (
[SoundID] [int] IDENTITY (1, 1) NOT NULL ,
[Soundexe] [varchar] (50) NULL ,
[CompanyName] [varchar] (50) NULL ,
[LeftOn] [bit] NOT NULL DEFAULT (1),
[RightOn] [bit] NOT NULL DEFAULT (1),
[NotLike] [bit] NOT NULL DEFAULT (1),
[BothOff] [bit] NOT NULL DEFAULT (0)
)

INSERT INTO @Commission_Soundex (SoundExe, CompanyName, LeftOn, RightOn, NotLike, BothOff)
SELECT 'ABN', 'ABM Ambro Vancouver', 1, 1, 0, 0 UNION ALL
SELECT 'ABN', 'ABN_AMRO', 1, 1, 0, 0 UNION ALL
SELECT 'ACCENTU', 'Accenture', 1, 1, 0, 0 UNION ALL
SELECT 'AEGON', 'AEGON', 0, 0, 0, 1 UNION ALL
SELECT 'AG ED', 'AG Edwards', 1, 1, 0, 0 UNION ALL
SELECT 'AIC', 'AIC Limited', 0, 1, 0, 0 UNION ALL
SELECT 'AIG', 'AIG 2006', 0, 1, 0, 0 UNION ALL
SELECT 'TRIMA', 'AIM TRIMARK', 1, 1, 0, 0 UNION ALL
SELECT 'ALLIANCE CA', 'Alliance Capital Management', 1, 1, 0, 0 UNION ALL
SELECT 'ALLIANZ', 'Allianz', 0, 0, 0, 1 UNION ALL
SELECT 'ALLSTAT', 'Allstate Financial', 1, 1, 0, 0

SELECT * FROM @Commission_Soundex

UPDATE @Commission_Soundex
SET CompanyName = 'TRIMA'
WHERE CompanyName = 'AIM TRIMARK'

SELECT * FROM @Commission_Soundex



John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
kchrysler
kchrysler
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 8
Thanks, but I must not have explained myself well. I don't know what the algorithm is that converts the 'CompanyName' to 'Soundexe'. For example, CompanyName = 'Accenture', and Soundexe = 'ACCENTU'.

At first I thought it was simply a 'soundex' function - but it is not. I really don't think that anyone could crack this, but I am hopeful!!!
John Rowan
John Rowan
SSChampion
SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)

Group: General Forum Members
Points: 13918 Visits: 4588
What makes you think that the SoundExe value is derived from the CompanyName? What does the SoundExe values represent?

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
kchrysler
kchrysler
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 8
I think that the Soundexe column is derived from the company name because in a Stored Procedure used later, the following SQL is executed:


SELECT DISTINCT C.COMPANYNAME, CS.SOUNDEXE, D.COMPANYNAME, D.SALESREPID, YEAR(ORDERDATE) AS YR
FROM COMMISSION_SOUNDEX CS LEFT JOIN
DISCOUNT D ON CS.COMPANYNAME=D.COMPANYNAME JOIN
COMPANY C ON C.COMPANYNAME LIKE ('%' + CS.SOUNDEXE + '%') JOIN
CUSTOMER CU ON C.COMPANYID=CU.COMPANYID LEFT JOIN
ORDERS O ON CU.CUSTOMERID=O.CUSTOMERID
WHERE CS.LEFTON=0 AND CS.RIGHTON=1
AND ORDERDATE IS NOT NULL

There are several iterations of this logic with the following changes:

COMPANY C ON C.COMPANYNAME LIKE ('%' + CS.SOUNDEXE) JOIN
WHERE CS.LEFTON=1 AND CS.RIGHTON=0
and so on. (Notice the % is only on the Left side now, and the query is looking for 'lefton' = 1.)

So, it is being used to find companies which are spelled similarly. The newly found 'companies' are placed in a temp table which must be reviewed by a human to determine if the newly found company is indeed a branch of the company. (All of this is used to calculate commissions.)
John Rowan
John Rowan
SSChampion
SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)

Group: General Forum Members
Points: 13918 Visits: 4588
Yes, but this would appear to me to be more of a business rule. How does the data initially get inserted into the table?

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
kchrysler
kchrysler
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 8
I don't know how it was intially entered into the table. All I know is that the table has not been updated since January, 2007. I have searched through SQL Server to find references to the table to no avail. So, I really have no idea how is should be updated. It does appear to be either a mass load or a program that created the data however. There is actually a 'last_updated column in the database and they have the same datetime in them.
John Rowan
John Rowan
SSChampion
SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)

Group: General Forum Members
Points: 13918 Visits: 4588
Without documentation, it's hard to say. It does appear that the SoundExe values are all comprised of the left most X characters of the name, although I cannot see a pattern as to how many values are used for each name. I don't think there is an algorithm that will reproduce the values.

Since you cannot find code that inserts the data and it appears to be static, I would suggest finding all of the code that queries the tables and go through it until you have a very good understanding of how the data is used. Then use your knowledge of the data access rules to manually change the CompanyName and SoundExe values. This is the only way I can think of to ensure that once you change the values, your queries will still return the expected results.

This is probably not the majic answer that you were looking for, but unless someone else chimes in here, I'm out of ideas. You should also find the guys who originally designed the system and pummel them upon the neck and head BigGrin.

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
Michael Earl-395764
Michael Earl-395764
SSChampion
SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)

Group: General Forum Members
Points: 13533 Visits: 23078
I would guess that some developer thought he could wire a better SoundEx and something was coded into a front end application that inserted the data. If I were you, I would search all of the source code repositories at your company for the string "soundexe" and hope that something comes back.

There is also the chance the the data was imported from some source and this information came with it.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search