July 11, 2012 at 12:33 pm
I have 2 tables like so:
PRODUCT [SKU varchar(10); SHORT_DESCRIPTION varchar(255)] - (Approx record count 800K)
KEYWORD [KEYWORD varchar(50)] - (Approx record count 300K)
What I need to do is find any records in the PRODUCT table that contains in the SHORT_DESCRIPTION field any of the keywords in the KEYWORD table. Until now the keyword table was very small (10 or less) and I used a cross join to the products table and a like %keyword% in the where clause. BUT - the pain happens when you try to scale at all (Let alone to 300!!!).
I am looking for alternatives especially ones that scale.
A few ideas rolling around my head to perhaps test but did not yet implement: -Pivot the keywords table and then inner join with products (so I end up with the keywords horizontally 1 line per record instead of vertically, but would then need dynamic sql to perform the like % filter)
-ARGGH - Cursor over the keywords table one at a time in place of that nasty cross join
Thank You!
July 11, 2012 at 12:38 pm
A small example of the overall problem may help with understanding.
July 11, 2012 at 12:44 pm
Sorry about that.
Table "Product" has items with descriptions.
Table "Keywords" is a running list of keywords that we would want to exclude from the outcome of some queries.
What I am looking to get, is a list of skus (from the product table) where the description contains any of the keywords in the product table. Until now we cross-joined the 2 and used a like %keyword% in the where clause. That didnt scale hence...
July 11, 2012 at 12:54 pm
sammyIT (7/11/2012)
Sorry about that.Table "Product" has items with descriptions.
Table "Keywords" is a running list of keywords that we would want to exclude from the outcome of some queries.
What I am looking to get, is a list of skus (from the product table) where the description contains any of the keywords in the product table. Until now we cross-joined the 2 and used a like %keyword% in the where clause. That didnt scale hence...
What we really need is ddl and sample data. Your description is incredibly vague and does not lend itself to the volunteers around here finding a solution for you. Take a look at the first link in my signature for best practices when posting questions.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
July 11, 2012 at 12:59 pm
Lucky for you I had a few minutes of downtime.
Since there is no sample data or ddl I created my own to demonstrate one approach to this.
create table #Products
(
ProductID int identity not null primary key,
SKU varchar(10),
ShortDesc varchar(255)
)
create table #Keywords
(
KeyWord varchar(50)
)
insert #Products
select '123456', 'This is a keyword description.' union all
select '234123', 'This is also a bad one.' union all
select '484756', 'thiis is good.' union all
select '867584', 'bad one is here'
Insert #Keywords
select 'this' union all
select 'bad'
--First this will show you the complete list of "bad words"
select *
from #Products p
cross apply #Keywords kw
cross apply dbo.DelimitedSplit8K(ShortDesc, ' ') x
where kw.Keyword = x.item
--now to view only those products who have at least one match
select *
from #Products
where ProductID in
(
select p.ProductID
from #Products p
cross apply #Keywords kw
cross apply dbo.DelimitedSplit8K(ShortDesc, ' ') x
where kw.Keyword = x.item
)
drop table #Products
drop table #Keywords
For details about the DelimitedSplit8K function you can view the article in my signature about splitting strings. 😀
This works for the sample data I provided which I THINK matches your issue.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
July 12, 2012 at 3:40 pm
Thank you all!!! I will get to work and post back if I encounter any issues.
Viewing 6 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply