• born2achieve (9/28/2013)


    Hi,

    I have two tables named "Table1" and "Table2".

    Table1 Details:

    id bigint, product_name nvarchar(1000),quantity int

    records count on Table1 : 25000( may increase in future)

    Table2 Details:

    id bigint, product_name nvarchar(max),details nvarchar(1000), description nvarchar(1000)

    Note: Table2 productName column will have comma separated values

    records count on Table2 : 186289( may increase in future)

    sample data:

    Table1:

    1 canola 120

    2 bread 130

    3 sauce 140

    4 corn 120

    Table2:

    1 canola,tea,muffin,cheese jellyproducts null

    2 vinegar,canola,sunflower oliproducts null

    3. cornil,vegoil,canola,sesameoil oilproducts null

    my requirement is i will take each prodcutname from Table1 and will search it in Table2's prodcutname column. if matches found then get the row and insert into temp table.

    i tried with while loop/cursor with "like" condition but it takes 15+ hours to process the records. I know searching in comma separate data with "like" will take time, but it take 15+ hours and still running.

    if am not wrong that looping will be bad on this larger records. so i have tried with joins as well. but nothing seems reducing the processing time.

    Try1 :

    select T2.*

    from dbo.Table1 t1

    inner join dbo.Table2 t2 on t2.product_name like '%' + t1.product_name + '%'

    Try2:

    select T2.*

    from dbo.Table2 T2

    inner join dbo.Table1 T1 on CHARINDEX(T1.product_name, T2.product_name) > 0

    Is there any fastest way to achieve this requirement? please help me on this and share with me sample code if there is any.

    Is there any duplication of single product names in the product_name column of Table 2?

    Also, the absolute BEST way to do this would be to properly normalize Table 2. CSV columns are one of the worst things you can store in a permanent table in a database for all the reasons that you're now finding out.

    Also, for future posts, it's best if you post readily consumable test data. Please see the article at the first link under "Helpful Links" in my signature line below.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)