May 8, 2009 at 4:28 pm
Hi
I have the following table, and I want to obtain the IDs of the rows that have column1 with at least 1 word matched in column 2, so the expected result would be
ID 1 - because there are 2 words in common (a2 a3)
ID 3 - two words in common (c1 c2)
ID 4 - one word in common (n2)
ID ---- STR1 ---- STR2
------------------------------------------------------------
1 ---- a1 a2 a3 ---- a3 a2
2 ---- b3 b2 b1 ---- x4
3 ---- c1 c2 ---- c1 c2
4 ---- n1 n2 ---- n2 y3
The order of the words doesnt matter as long as there is at least one word of STR1 in STR2
Thanks in Advance
May 9, 2009 at 4:26 am
Heh... this is quiet simple (if you are currently dreaming of string-splitting like me). It's not simple.
What you need is to split your data into pieces by " " and JOIN them (in the WHERE clause). Usually I would suggest to combine two traditional Tally based split functions. For detailed information about the Tally table and a set based split function search this site for Jeff Moden's article about the "Tally" (aka Numbers) table.
Special issues of your requirement:
* Your data are not enclosed with the delimiter. Usually set based solutions require this. So you need a UNION ALL tally extension
* You need to split two columns instead of one. So you need a double-barreled approach
For detailed information about any kind of split string approaches have a look to this thread:
http://www.sqlservercentral.com/Forums/Topic695508-338-40.aspx#bm710801
Give this a try:
[font="Courier New"]
DECLARE @t TABLE (Id INT, STR1 VARCHAR(100), STR2 VARCHAR(100))
INSERT INTO @t
SELECT 1, 'a1 a2 a3', 'a3 a2'
UNION ALL SELECT 2, 'b3 b2 b1', 'x4'
UNION ALL SELECT 3, 'c1 c2', 'c1 c2'
UNION ALL SELECT 4, 'n1 n2', 'n2 y3'
; WITH
t1 AS (SELECT 1 N UNION ALL SELECT 2), -- 2
t2 AS (SELECT t1.N FROM t1 t1 CROSS JOIN t1 t2), -- 4
t3 AS (SELECT t1.N FROM t2 t1 CROSS JOIN t2 t2), -- 16
t4 AS (SELECT t1.N FROM t3 t1 CROSS JOIN t3 t2), -- 256
Tally AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) N FROM t4 t1 CROSS JOIN t4 t2) -- 65536
SELECT DISTINCT
s.*
FROM @t s
CROSS APPLY
(
SELECT
-- Get the first item until the first delimiter
SUBSTRING(s.STR1, 1, ISNULL(NULLIF(CHARINDEX(' ', s.STR1, 1) - 1, -1), LEN(s.STR1))) Item
UNION ALL
-- Traditional Tally split with extension to get last item without needed delimiter at the end
SELECT TOP 100 PERCENT
SUBSTRING(s.STR1, t.N + 1, ISNULL(NULLIF(CHARINDEX(' ', s.STR1, t.N + 1) - t.N - 1, -t.N - 1), LEN(s.STR1) - t.N)) Item
FROM Tally t
WHERE t.N <= LEN(s.STR1)
AND SUBSTRING(s.STR1, t.N, 1) = ' '
) l1
CROSS APPLY
(
SELECT
-- Get the first item until the first delimiter
SUBSTRING(s.STR2, 1, ISNULL(NULLIF(CHARINDEX(' ', s.STR2, 1) - 1, -1), LEN(s.STR2))) Item
UNION ALL
-- Traditional Tally split with extension to get last item without needed delimiter at the end
SELECT TOP 100 PERCENT
SUBSTRING(s.STR2, t.N + 1, ISNULL(NULLIF(CHARINDEX(' ', s.STR2, t.N + 1) - t.N - 1, -t.N - 1), LEN(s.STR2) - t.N)) Item
FROM Tally t
WHERE t.N <= LEN(s.STR2)
AND SUBSTRING(s.STR2, t.N, 1) = ' '
) l2
WHERE l1.Item = l2.Item
[/font]
Greets
Flo
Viewing 2 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply