Update CSV string with lookup table

  • I need to replace word(s) in CSV string in table tblA, in the [original] column, by [word] from lookup table tblB. The new CSV string will be stored in tblA table, in the column [modified]. If the [tblB].[synonyms] column doesn't contain the word from CSV in tblA, then the original word will be kept. For example the word flowers is not in [tblB].[synonyms], so flowers will be kept in CSV and saved in [tblA].[modified] column:


    TABLE: tblA

    UPDATED tblA

    I tried to use string_split function for tblB table, but I wasn't able to iterate through the CSV string from tblA. Can someone provide any suggestions? Thanks.


  • can you get used to post data in a consumable way - see on code below one way of doing it.

    one way below to achieve what you need

    drop table if exists #basetable
    drop table if exists #lookuptable
    drop table if exists #templookuptable

    select *
    into #lookuptable
    from (values ('fluid', 'sea|river|ocean')
    , ('hard', 'stone|diamant')
    ) t(word, synonyms)

    select *
    into #basetable
    from (values ('sea|sky', cast(NULL as varchar(100)))
    , ('river|stone', null)
    , ('flower|tree', null)
    ) t(original, modified)

    -- first step - split the replacement strings from #lookuptable into individual records
    -- this is so we can identify those rows from #basetable that need changing and ignore the remaining
    select t.*
    , t2.value
    , '%|' + t2.value + '|%' as searchkey -- note we add start and end delimiter so the search for the main table works regardless of position of the keyword
    into #templookuptable
    from #lookuptable t
    cross apply STRING_SPLIT(t.synonyms, '|', 1) t2

    update bt
    set bt.modified = coalesce(t.Modified, bt.original)
    from #basetable bt
    left outer join (select bt.original
    , STRING_AGG(coalesce(t3.word, t2.value), '|') within group (order by t2.ordinal) as Modified
    from #basetable bt
    cross apply STRING_SPLIT(bt.original, '|', 1) t2
    outer APPLY (select top 1 t3.word
    from #templookuptable t3
    where t3.value = t2.value
    ) t3
    where exists (select 1 from #templookuptable lt
    where '|' + bt.original + '|' like lt.searchkey
    group by bt.original
    ) t
    on t.original = bt.original

    select *
    from #basetable bt
  • Thank you. I need a bit time to understand all parts of it.

  • I tried to learn from your script. I applied it to my project and it seems to be working. Could you please explain me couple of questions, which I don't understand:

    1/ why we need the ordering by t2.ordinal

    2/ I understand how the string a|b|c is split into separate rows, but I don't understand how are the rows combined to get the string 'modified' e|f|g at the end.



  • its all in this line

    " , STRING_AGG(coalesce(t3.word, t2.value), '|') within group (order by t2.ordinal) as Modified"

    plus the group by bt.original

    string_agg converts from rows to a single column - and the "order by t2.ordinal" ensures that the order of the columns (or their replacement) is on the same order as the original string. Without it you could endup with

    • original: river|stone

      modified: hard|fluid

    instead of

    • original: river|stone

      modified: fluid|hard


  • Thanks

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply