Need help finding one set of data within another set of data for update

  • Need help finding one set of data within another set of data for update,

    so if within column 1 it finds the results from another select on another table,

    it updates a different column with the result.

    Example

    SELECT Title

    from H..Import_Table

    returns

    'PRN - Concord'

    'PRN - San Jose'

    'SLP - PRN - San Jose'

    'San Jose - PT - PRN'

    'PTA - PRN - Santa Cruz'

    'Rancho Santa Margarita PT - PRN'

    'PT- PRN-San Diego'

    'OT - PRN - Rancho Palos Verdes'

    'SLP - PRN - Rancho Santa Margarita'

    'OTR - PRN - Laguna Hills'

    'PT - PRN - Cardiff by the Sea'

    SELECT City

    from E..City

    returns

    'Concord'

    'San Jose'

    'San Jose'

    'San Jose'

    'Santa Cruz'

    'Rancho Santa Margarita'

    'San Diego'

    'Rancho Palos Verdes'

    'Rancho Santa Margarita'

    'Laguna Hills'

    'Cardiff by the Sea'

    UPDATE H..Import_Table

    SET CITY = CASE WHEN (SELECT city FROM E..City) IN (SELECT title FROM H..Import_Table)

    THEN (SELECT city FROM E..City) ELSE NULL end

    Hopefully I am explaining this well enough.

  • -- Start of what helps us help you

    create table #import (importcity nvarchar(128), title nvarchar(128))

    create table #city (city nvarchar(128))

    insert #import (importcity, title)

    values

    ('SomthingElse', 'PRN - Concord'),

    ('SomthingElse', 'PRN - San Jose'),

    ('SomthingElse', 'SLP - PRN - San Jose'),

    ('SomthingElse', 'San Jose - PT - PRN'),

    ('SomthingElse', 'PTA - PRN - Santa Cruz'),

    ('SomthingElse', 'Rancho Santa Margarita PT - PRN'),

    ('SomthingElse', 'PT- PRN-San Diego'),

    ('SomthingElse', 'OT - PRN - Rancho Palos Verdes'),

    ('SomthingElse', 'SLP - PRN - Rancho Santa Margarita'),

    ('SomthingElse', 'OTR - PRN - Laguna Hills'),

    ('SomthingElse', 'PT - PRN - Cardiff by the Sea')

    insert #city (city)

    values

    ('Concord'),

    ('San Jose'),

    ('San Jose'),

    ('San Jose'),

    ('Santa Cruz'),

    ('Rancho Santa Margarita'),

    ('San Diego'),

    ('Rancho Palos Verdes'),

    ('Rancho Santa Margarita'),

    ('Laguna Hills'),

    ('Cardiff by the Sea')

    -- A sample of the expected results goes here.

    -- End of what helps us help you

    -- the answer (assuming I interpreted the pseudo-T-SQL correctly)

    update #import

    set importcity = city

    from #import join #city

    on charindex(city, title)>0

    /*

    unable to guarantee a distinct city for a given title,

    such as when a title mentions two cities .

    For example, what if there was also a city named 'Rancho'?

    */

  • Good point about partial matches. Presumably you'd want to match on the longest match (?!):

    insert into #import (importcity, title)

    select 'new_city_for_testing', 'rancho'

    insert into #city (city)

    select 'rancho'

    update i

    set importcity = c.city

    from #import i

    cross apply (

    select top (1) c2.city

    from #city c2

    where charindex(c2.city, i.title)>0

    order by len(c2.city) desc

    ) as c

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Great Answers, I appreciate both. I need to make this a little more complicated though.

    Because of the issue that SoHelpMeCodd brought up in where it may find two cities, I need to also match the on the State that is almost always in the title, and in Caps if that helps.

    Reason being I have some titles like 'Terraces of Boise - Pathologist - Boise, ID' and

    'Wellness Center Aide - Saint Louis, MO' and its finding 'Terrace' as the city instead of 'Boise' or 'Center' as the city instead of 'Saint Louis'

    So I've added to the title the state it also needs to match in the location table that now has both a city and a state.

    I don't mind it NOT presenting a match or leaving the field as NULL if it DOES NOT find a matching state.

    insert #import (importcity, title)

    values

    ('SomthingElse', 'PRN - Concord, NC'),

    ('SomthingElse', 'PRN - San Jose,NM'),

    ('SomthingElse', 'SLP - PRN - San Jose,NM'),

    ('SomthingElse', 'San Jose, NM - PT - PRN'),

    ('SomthingElse', 'PTA - PRN - Santa Cruz,NM'),

    ('SomthingElse', 'Rancho Santa Margarita, CA PT - PRN'),

    ('SomthingElse', 'PT- PRN-San Diego, CA'),

    ('SomthingElse', 'OT - PRN - Rancho Palos Verdes,CA'),

    ('SomthingElse', 'SLP - PRN - Rancho Santa Margarita, CA'),

    ('SomthingElse', 'OTR - PRN - Laguna Hills,CA'),

    ('SomthingElse', 'PT - PRN - Cardiff by the Sea,CA')

    insert #location (city,state)

    values

    ('Concord','NC'),

    ('San Jose','NM'),

    ('San Jose','CA'),

    ('Santa Cruz','NM'),

    ('Rancho Santa Margarita','CA'),

    ('San Diego','CA'),

    ('Rancho Palos Verdes','CA'),

    ('Rancho Santa Margarita','CA'),

    ('Laguna Hills','CA'),

    ('Cardiff by the Sea','CA')

  • I'm late here, and maybe I'm misunderstanding something, but you can use SUBSTRING with a length param.

    With this, if you have the dash in the string, could you do something like:

    where substring( mystring, charindex('-', mystring), len(mystring)) like '%Terrance%'

    https://msdn.microsoft.com/en-us/library/ms187748.aspx

    https://msdn.microsoft.com/en-us/library/ms186323.aspx

  • The same approach (Scott's CROSS APPLY is faster than my JOIN, due to its use of DISTINCT) used for a City can be used for State. For State you can use the COLLATE clause to specify case-sensitivity. However, doing both JOINS within one statement will cause Collation Precedence rules to be imposed, which may not work if the city is not case-sensitive.

    While an edge case, there is also no guarantee that a City is unique to a State. For example, there are two 'Denton, TX' (http://www.mapquest.com/search/results?page=0&centerOnResults=1&query=denton%20tx).

  • Also late to the party; have you considered stripping the odd text elements out of the column to leave the city name? You'd want to do this as a one-off.

    DROP table #import;create table #import (importcity nvarchar(128), title nvarchar(128))

    insert #import (importcity, title)

    values

    ('SomthingElse', 'PRN - Concord, NC'),

    ('SomthingElse', 'PRN - San Jose,NM'),

    ('SomthingElse', 'SLP - PRN - San Jose,NM'),

    ('SomthingElse', 'San Jose, NM - PT - PRN'),

    ('SomthingElse', 'PTA - PRN - Santa Cruz,NM'),

    ('SomthingElse', 'Rancho Santa Margarita, CA PT - PRN'),

    ('SomthingElse', 'PT - PRN-San Diego, CA'),

    ('SomthingElse', 'OT - PRN - Rancho Palos Verdes,CA'),

    ('SomthingElse', 'SLP - PRN - Rancho Santa Margarita, CA'),

    ('SomthingElse', 'OTR - PRN - Laguna Hills,CA'),

    ('SomthingElse', 'PT - PRN - Cardiff by the Sea,CA')

    SELECT *

    FROM #import

    CROSS APPLY (SELECT CleanedTitle =

    LTRIM(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(title,'OTR - PRN',''),'OT - PRN',''),'PTA - PRN',''),'SLP - PRN',''),'PT - PRN',''),'PRN -',''),'-',''))

    ) x

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply