Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Partial match & Update Expand / Collapse
Author
Message
Posted Wednesday, July 3, 2013 11:55 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Thursday, April 24, 2014 8:19 PM
Points: 56, Visits: 240
I need update the AppID in table1 by comparing Table2. the data looks like below. Here the problem is AppNames in the both tables are not exactly same but technically they are same with partial match.

Table1

AppID||||Appname
Null ||||CSO Services Operations
Null |||| CSP
Null ||||AAC Claims Systems





Table 2

AppID|||||Appname
1 |||||Corporate Services Operations(CSO)
2 |||||Credit Servicing Portal(CSP)
3 |||||American Assuarance Company (AAC) Claims Systems


Please let me know how to get rid of this task.

thanks
Post #1470184
Posted Wednesday, July 3, 2013 12:12 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 1:24 PM
Points: 12,906, Visits: 31,984
you are going to have to cleanup the data, it looks to me;
in your example, using charindex, you only get a single match.

unless you can guarantee the three letter abbreviations won't return false matches ("COR" would find "Google Corporation", for example), i don't see anything to reliably join the data.



/*
AppID Appname AppID Appname
Null CSP 2 Credit Servicing Portal(CSP)
*/
;With Table1([AppID],[Appname])
AS
(
SELECT 'Null','CSO Services Operations' UNION ALL
SELECT 'Null','CSP' UNION ALL
SELECT 'Null','AAC Claims Systems'
),Table2([AppID],[Appname])
AS
(
SELECT '1','Corporate Services Operations(CSO)' UNION ALL
SELECT '2','Credit Servicing Portal(CSP)' UNION ALL
SELECT '3','American Assuarance Company (AAC) Claims Systems'
)
SELECT *
FROM Table1
INNER JOIN Table2
ON CHARINDEX(Table1.Appname,Table2.Appname) > 0



Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #1470188
Posted Wednesday, July 3, 2013 12:15 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Thursday, April 24, 2014 8:19 PM
Points: 56, Visits: 240
Hi Lowell,

Thanks for your pormpt reply. But i do have 1000 records to macth in both tables. So, we have to put all the thousand values in the way you mentioned?? or do we have any other way to do it

Please advise me

Thanks
Post #1470192
Posted Wednesday, July 3, 2013 1:24 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 1:24 PM
Points: 12,906, Visits: 31,984
DBA12345 (7/3/2013)
Hi Lowell,

Thanks for your pormpt reply. But i do have 1000 records to macth in both tables. So, we have to put all the thousand values in the way you mentioned?? or do we have any other way to do it

Please advise me

Thanks


well the obvious fix is hard work. you've inherited an app that didn't normalize these fields, and now you need to sanitize and normalize it after the fact...it's not fun.

1. you have to create a master table for the possible values of AppName, with an identity primary key/unique constraint on Appname., making sure to add an acronym for versatility matching.

my wild guess:

CREATE TABLE [dbo].[APPNAMELOOKUP] (
[APPNAMEID] INT IDENTITY(1,1) NOT NULL,
APPACRONYM VARCHAR(3),
[APPNAME] VARCHAR(100) NOT NULL,
CONSTRAINT [PK__AppNameLookup__11C2C212] PRIMARY KEY CLUSTERED (AppNameID),
CONSTRAINT [UQ__AppNameLookup__12B6E64B] UNIQUE NONCLUSTERED (AppName))


2. you have to go through all the descriptions and select only ONE per "real" appname.

3.Whatever application that allows users to freetext an appName must now be modified to select the value from a drop down menu instead.

4. Your existing tables now need new column , which now point to the new table as a foreign key, instead of having descriptions

5. that means a migration by analysis to update the new column to the "correct" value for the Designated Appname. This would probably be a series of scripts, That you custom write yourself, that updates based on comparing charindex like above, then trying for partial matches, and then by 3 letter acronyms, and then finally, for any that did not get caught in the above scripts, a stack of manually assigning records based on eyeball analysis.


Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #1470207
Posted Wednesday, July 3, 2013 2:17 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 5:29 PM
Points: 3,513, Visits: 7,565
I'm with Lowell on cleaning up the data.
However, this might help you, but you still have to review the matches given.

;With Table1([AppID],[Appname])
AS
(
SELECT 'Null','CSO Services Operations' UNION ALL
SELECT 'Null','CSP' UNION ALL
SELECT 'Null','AAC Claims Systems'
),Table2([AppID],[Appname])
AS
(
SELECT '1','Corporate Services Operations(CSO)' UNION ALL
SELECT '2','Credit Servicing Portal(CSP)' UNION ALL
SELECT '3','American Assuarance Company (AAC) Claims Systems'
)
SELECT *
FROM Table1
JOIN Table2 ON Table1.Appname LIKE SUBSTRING( Table2.Appname, CHARINDEX( '(',Table2.Appname) + 1, CHARINDEX( ')',Table2.Appname) - CHARINDEX( '(',Table2.Appname) - 1) + '%'




Luis C.
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

Forum Etiquette: How to post data/code on a forum to get the best help
Post #1470225
Posted Wednesday, July 3, 2013 4:20 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 3:51 PM
Points: 36,959, Visits: 31,469
DBA12345 (7/3/2013)
I need update the AppID in table1 by comparing Table2. the data looks like below. Here the problem is AppNames in the both tables are not exactly same but technically they are same with partial match.

Table1

AppID||||Appname
Null ||||CSO Services Operations
Null |||| CSP
Null ||||AAC Claims Systems


Table 2

AppID|||||Appname
1 |||||Corporate Services Operations(CSO)
2 |||||Credit Servicing Portal(CSP)
3 |||||American Assuarance Company (AAC) Claims Systems


Please let me know how to get rid of this task.

thanks

Are all of the AppNames in Table1 absolutely guaranteed to contain the abbreviation contained in the parenthesis of the AppNames in Table2 and are all the AppNames in Table2 absolutely guaranteed to contain the abbreviation in parenthesis?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1470261
Posted Wednesday, July 3, 2013 7:17 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Wednesday, August 20, 2014 4:14 AM
Points: 3,618, Visits: 5,254
Lot's of assumptions here including all of Jeff's, Lowell's that you don't have an abbreviation like COR that matches to Corporation and that also there are not extra parens in Table2, but here's a shot at it:

WITH Table1([AppID],[Appname]) AS (
SELECT 'Null','CSO Services Operations' UNION ALL
SELECT 'Null',' CSP' UNION ALL
SELECT 'Null','AAC Claims Systems'
)
,Table2([AppID],[Appname]) AS (
SELECT '1','Corporate Services Operations(CSO)' UNION ALL
SELECT '2','Credit Servicing Portal(CSP)' UNION ALL
SELECT '3','American Assuarance Company (AAC) Claims Systems'
)
SELECT a.*, e.*
FROM Table2 a
CROSS APPLY dbo.PatternSplitCM(a.Appname, '[()A-Za-z]') b
CROSS APPLY (SELECT RIGHT(Item, LEN(Item)-CHARINDEX('(', Item))) c(Item2)
CROSS APPLY (SELECT '%' + STUFF(Item2, LEN(Item2), 1, '') + '%') d(Item3)
INNER JOIN Table1 e ON e.AppName LIKE Item3
WHERE [Matched]=1 AND CHARINDEX('(', Item) > 0 AND CHARINDEX(')', Item) > 0


PatternSplitCM can be found in the 4th article in my signature links.

I wouldn't want to be responsible for this running in Prod though without a whole lot of real thorough testing through all the possible cases. Still, it might be possible to make it work if you could apply additional assumptions like that the abbreviation always starts in the first non-blank character of Table1.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1470279
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse