Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Partial match & Update


Partial match & Update

Author
Message
DBA12345
DBA12345
Valued Member
Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)

Group: General Forum Members
Points: 56 Visits: 240
I need update the AppID in table1 by comparing Table2. the data looks like below. Here the problem is AppNames in the both tables are not exactly same but technically they are same with partial match.

Table1

AppID||||Appname
Null ||||CSO Services Operations
Null |||| CSP
Null ||||AAC Claims Systems





Table 2

AppID|||||Appname
1 |||||Corporate Services Operations(CSO)
2 |||||Credit Servicing Portal(CSP)
3 |||||American Assuarance Company (AAC) Claims Systems


Please let me know how to get rid of this task.

thanks
Lowell
Lowell
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14957 Visits: 38959
you are going to have to cleanup the data, it looks to me;
in your example, using charindex, you only get a single match.

unless you can guarantee the three letter abbreviations won't return false matches ("COR" would find "Google Corporation", for example), i don't see anything to reliably join the data.




/*
AppID Appname AppID Appname
Null CSP 2 Credit Servicing Portal(CSP)
*/
;With Table1([AppID],[Appname])
AS
(
SELECT 'Null','CSO Services Operations' UNION ALL
SELECT 'Null','CSP' UNION ALL
SELECT 'Null','AAC Claims Systems'
),Table2([AppID],[Appname])
AS
(
SELECT '1','Corporate Services Operations(CSO)' UNION ALL
SELECT '2','Credit Servicing Portal(CSP)' UNION ALL
SELECT '3','American Assuarance Company (AAC) Claims Systems'
)
SELECT *
FROM Table1
INNER JOIN Table2
ON CHARINDEX(Table1.Appname,Table2.Appname) > 0



Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

DBA12345
DBA12345
Valued Member
Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)Valued Member (56 reputation)

Group: General Forum Members
Points: 56 Visits: 240
Hi Lowell,

Thanks for your pormpt reply. But i do have 1000 records to macth in both tables. So, we have to put all the thousand values in the way you mentioned?? or do we have any other way to do it

Please advise me

Thanks
Lowell
Lowell
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14957 Visits: 38959
DBA12345 (7/3/2013)
Hi Lowell,

Thanks for your pormpt reply. But i do have 1000 records to macth in both tables. So, we have to put all the thousand values in the way you mentioned?? or do we have any other way to do it

Please advise me

Thanks


well the obvious fix is hard work. you've inherited an app that didn't normalize these fields, and now you need to sanitize and normalize it after the fact...it's not fun.

1. you have to create a master table for the possible values of AppName, with an identity primary key/unique constraint on Appname., making sure to add an acronym for versatility matching.

my wild guess:


CREATE TABLE [dbo].[APPNAMELOOKUP] (
[APPNAMEID] INT IDENTITY(1,1) NOT NULL,
APPACRONYM VARCHAR(3),
[APPNAME] VARCHAR(100) NOT NULL,
CONSTRAINT [PK__AppNameLookup__11C2C212] PRIMARY KEY CLUSTERED (AppNameID),
CONSTRAINT [UQ__AppNameLookup__12B6E64B] UNIQUE NONCLUSTERED (AppName))



2. you have to go through all the descriptions and select only ONE per "real" appname.

3.Whatever application that allows users to freetext an appName must now be modified to select the value from a drop down menu instead.

4. Your existing tables now need new column , which now point to the new table as a foreign key, instead of having descriptions

5. that means a migration by analysis to update the new column to the "correct" value for the Designated Appname. This would probably be a series of scripts, That you custom write yourself, that updates based on comparing charindex like above, then trying for partial matches, and then by 3 letter acronyms, and then finally, for any that did not get caught in the above scripts, a stack of manually assigning records based on eyeball analysis.

Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Luis Cazares
Luis Cazares
SSCrazy Eights
SSCrazy Eights (8.5K reputation)SSCrazy Eights (8.5K reputation)SSCrazy Eights (8.5K reputation)SSCrazy Eights (8.5K reputation)SSCrazy Eights (8.5K reputation)SSCrazy Eights (8.5K reputation)SSCrazy Eights (8.5K reputation)SSCrazy Eights (8.5K reputation)

Group: General Forum Members
Points: 8536 Visits: 18130
I'm with Lowell on cleaning up the data.
However, this might help you, but you still have to review the matches given.

;With Table1([AppID],[Appname])
AS
(
SELECT 'Null','CSO Services Operations' UNION ALL
SELECT 'Null','CSP' UNION ALL
SELECT 'Null','AAC Claims Systems'
),Table2([AppID],[Appname])
AS
(
SELECT '1','Corporate Services Operations(CSO)' UNION ALL
SELECT '2','Credit Servicing Portal(CSP)' UNION ALL
SELECT '3','American Assuarance Company (AAC) Claims Systems'
)
SELECT *
FROM Table1
JOIN Table2 ON Table1.Appname LIKE SUBSTRING( Table2.Appname, CHARINDEX( '(',Table2.Appname) + 1, CHARINDEX( ')',Table2.Appname) - CHARINDEX( '(',Table2.Appname) - 1) + '%'




Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?


How to post data/code on a forum to get the best help: Option 1 / Option 2
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45104 Visits: 39912
DBA12345 (7/3/2013)
I need update the AppID in table1 by comparing Table2. the data looks like below. Here the problem is AppNames in the both tables are not exactly same but technically they are same with partial match.

Table1

AppID||||Appname
Null ||||CSO Services Operations
Null |||| CSP
Null ||||AAC Claims Systems


Table 2

AppID|||||Appname
1 |||||Corporate Services Operations(CSO)
2 |||||Credit Servicing Portal(CSP)
3 |||||American Assuarance Company (AAC) Claims Systems


Please let me know how to get rid of this task.

thanks

Are all of the AppNames in Table1 absolutely guaranteed to contain the abbreviation contained in the parenthesis of the AppNames in Table2 and are all the AppNames in Table2 absolutely guaranteed to contain the abbreviation in parenthesis?

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
dwain.c
dwain.c
SSCarpal Tunnel
SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)

Group: General Forum Members
Points: 4255 Visits: 6431
Lot's of assumptions here including all of Jeff's, Lowell's that you don't have an abbreviation like COR that matches to Corporation and that also there are not extra parens in Table2, but here's a shot at it:


WITH Table1([AppID],[Appname]) AS (
SELECT 'Null','CSO Services Operations' UNION ALL
SELECT 'Null',' CSP' UNION ALL
SELECT 'Null','AAC Claims Systems'
)
,Table2([AppID],[Appname]) AS (
SELECT '1','Corporate Services Operations(CSO)' UNION ALL
SELECT '2','Credit Servicing Portal(CSP)' UNION ALL
SELECT '3','American Assuarance Company (AAC) Claims Systems'
)
SELECT a.*, e.*
FROM Table2 a
CROSS APPLY dbo.PatternSplitCM(a.Appname, '[()A-Za-z]') b
CROSS APPLY (SELECT RIGHT(Item, LEN(Item)-CHARINDEX('(', Item))) c(Item2)
CROSS APPLY (SELECT '%' + STUFF(Item2, LEN(Item2), 1, '') + '%') d(Item3)
INNER JOIN Table1 e ON e.AppName LIKE Item3
WHERE [Matched]=1 AND CHARINDEX('(', Item) > 0 AND CHARINDEX(')', Item) > 0




PatternSplitCM can be found in the 4th article in my signature links.

I wouldn't want to be responsible for this running in Prod though without a whole lot of real thorough testing through all the possible cases. Still, it might be possible to make it work if you could apply additional assumptions like that the abbreviation always starts in the first non-blank character of Table1.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search