|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Wednesday, July 23, 2008 8:59 AM
Points: 4,
Visits: 8
|
|
Background I have inherited a system which has a table that stores CompanyName , and a column called ‘soundexe’; soundexe contains letters which are then compared to other company names using Like.
CREATE TABLE [dbo].[Commission_Soundex] ( [SoundID] [int] IDENTITY (1, 1) NOT NULL , [Soundexe] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [CompanyName] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [LeftOn] [bit] NOT NULL CONSTRAINT [DF_Commission_Soundex_LeftOn] DEFAULT (1), [RightOn] [bit] NOT NULL CONSTRAINT [DF_Commission_Soundex_RightOn] DEFAULT (1), [NotLike] [bit] NOT NULL CONSTRAINT [DF_Commission_Soundex_BothOn] DEFAULT (1), [BothOff] [bit] NOT NULL CONSTRAINT [DF_Commission_Soundex_BothOff] DEFAULT (0), CONSTRAINT [PK_Commission_Soundex] PRIMARY KEY CLUSTERED
Here are a few rows of the table for you to consider: SoundID Soundexe CompanyName LeftOn RightOn NotLike BothOff 1 ABN ABM Ambro Vancouver 1 1 0 0 2 ABN ABN_AMRO 1 1 0 0 3 ACCENTU Accenture 1 1 0 0 4 AEGON AEGON 0 0 0 1 5 AG ED AG Edwards 1 1 0 0 6 AIC AIC Limited 0 1 0 0 7 AIG AIG 2006 0 1 0 0 8 TRIMA AIM TRIMARK 1 1 0 0 9 ALLIANCE CA Alliance Capital Management 1 1 0 0 10 ALLIANZ Allianz 0 0 0 1 11 ALLSTAT Allstate Financial 1 1 0 0
The Problem and Question My problem is this – I need to update this table with hundreds of companies and update this ‘soundexe’ column also. But I can not find the algorithm which was used to create it.
My hope is that someone will look at this data and tell me some wonderful function that would take the company name ‘AIM TRIMARK’ and convert it to ‘TRIMA’.
Thanks so much!!!
p.s. I have attached a spreadsheet of the table for those of you wanting more examples.
|
|
|
|
|
Hall of Fame
       
Group: General Forum Members
Last Login: Tuesday, January 29, 2013 10:54 AM
Points: 3,837,
Visits: 3,821
|
|
Uh, unless I don't quite understand your question, this is just a simple update statement.
DECLARE @Commission_Soundex TABLE ( [SoundID] [int] IDENTITY (1, 1) NOT NULL , [Soundexe] [varchar] (50) NULL , [CompanyName] [varchar] (50) NULL , [LeftOn] [bit] NOT NULL DEFAULT (1), [RightOn] [bit] NOT NULL DEFAULT (1), [NotLike] [bit] NOT NULL DEFAULT (1), [BothOff] [bit] NOT NULL DEFAULT (0) )
INSERT INTO @Commission_Soundex (SoundExe, CompanyName, LeftOn, RightOn, NotLike, BothOff) SELECT 'ABN', 'ABM Ambro Vancouver', 1, 1, 0, 0 UNION ALL SELECT 'ABN', 'ABN_AMRO', 1, 1, 0, 0 UNION ALL SELECT 'ACCENTU', 'Accenture', 1, 1, 0, 0 UNION ALL SELECT 'AEGON', 'AEGON', 0, 0, 0, 1 UNION ALL SELECT 'AG ED', 'AG Edwards', 1, 1, 0, 0 UNION ALL SELECT 'AIC', 'AIC Limited', 0, 1, 0, 0 UNION ALL SELECT 'AIG', 'AIG 2006', 0, 1, 0, 0 UNION ALL SELECT 'TRIMA', 'AIM TRIMARK', 1, 1, 0, 0 UNION ALL SELECT 'ALLIANCE CA', 'Alliance Capital Management', 1, 1, 0, 0 UNION ALL SELECT 'ALLIANZ', 'Allianz', 0, 0, 0, 1 UNION ALL SELECT 'ALLSTAT', 'Allstate Financial', 1, 1, 0, 0
SELECT * FROM @Commission_Soundex
UPDATE @Commission_Soundex SET CompanyName = 'TRIMA' WHERE CompanyName = 'AIM TRIMARK'
SELECT * FROM @Commission_Soundex
John Rowan
====================================================== ====================================================== Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Wednesday, July 23, 2008 8:59 AM
Points: 4,
Visits: 8
|
|
Thanks, but I must not have explained myself well. I don't know what the algorithm is that converts the 'CompanyName' to 'Soundexe'. For example, CompanyName = 'Accenture', and Soundexe = 'ACCENTU'.
At first I thought it was simply a 'soundex' function - but it is not. I really don't think that anyone could crack this, but I am hopeful!!!
|
|
|
|
|
Hall of Fame
       
Group: General Forum Members
Last Login: Tuesday, January 29, 2013 10:54 AM
Points: 3,837,
Visits: 3,821
|
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Wednesday, July 23, 2008 8:59 AM
Points: 4,
Visits: 8
|
|
I think that the Soundexe column is derived from the company name because in a Stored Procedure used later, the following SQL is executed:
SELECT DISTINCT C.COMPANYNAME, CS.SOUNDEXE, D.COMPANYNAME, D.SALESREPID, YEAR(ORDERDATE) AS YR FROM COMMISSION_SOUNDEX CS LEFT JOIN DISCOUNT D ON CS.COMPANYNAME=D.COMPANYNAME JOIN COMPANY C ON C.COMPANYNAME LIKE ('%' + CS.SOUNDEXE + '%') JOIN CUSTOMER CU ON C.COMPANYID=CU.COMPANYID LEFT JOIN ORDERS O ON CU.CUSTOMERID=O.CUSTOMERID WHERE CS.LEFTON=0 AND CS.RIGHTON=1 AND ORDERDATE IS NOT NULL
There are several iterations of this logic with the following changes:
COMPANY C ON C.COMPANYNAME LIKE ('%' + CS.SOUNDEXE) JOIN WHERE CS.LEFTON=1 AND CS.RIGHTON=0 and so on. (Notice the % is only on the Left side now, and the query is looking for 'lefton' = 1.)
So, it is being used to find companies which are spelled similarly. The newly found 'companies' are placed in a temp table which must be reviewed by a human to determine if the newly found company is indeed a branch of the company. (All of this is used to calculate commissions.)
|
|
|
|
|
Hall of Fame
       
Group: General Forum Members
Last Login: Tuesday, January 29, 2013 10:54 AM
Points: 3,837,
Visits: 3,821
|
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Wednesday, July 23, 2008 8:59 AM
Points: 4,
Visits: 8
|
|
| I don't know how it was intially entered into the table. All I know is that the table has not been updated since January, 2007. I have searched through SQL Server to find references to the table to no avail. So, I really have no idea how is should be updated. It does appear to be either a mass load or a program that created the data however. There is actually a 'last_updated column in the database and they have the same datetime in them.
|
|
|
|
|
Hall of Fame
       
Group: General Forum Members
Last Login: Tuesday, January 29, 2013 10:54 AM
Points: 3,837,
Visits: 3,821
|
|
Without documentation, it's hard to say. It does appear that the SoundExe values are all comprised of the left most X characters of the name, although I cannot see a pattern as to how many values are used for each name. I don't think there is an algorithm that will reproduce the values.
Since you cannot find code that inserts the data and it appears to be static, I would suggest finding all of the code that queries the tables and go through it until you have a very good understanding of how the data is used. Then use your knowledge of the data access rules to manually change the CompanyName and SoundExe values. This is the only way I can think of to ensure that once you change the values, your queries will still return the expected results.
This is probably not the majic answer that you were looking for, but unless someone else chimes in here, I'm out of ideas. You should also find the guys who originally designed the system and pummel them upon the neck and head :D.
John Rowan
====================================================== ====================================================== Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Tuesday, August 10, 2010 5:07 AM
Points: 2,732,
Visits: 23,078
|
|
I would guess that some developer thought he could wire a better SoundEx and something was coded into a front end application that inserted the data. If I were you, I would search all of the source code repositories at your company for the string "soundexe" and hope that something comes back.
There is also the chance the the data was imported from some source and this information came with it.
|
|
|
|