SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


More efficient way to perform a lookup?


More efficient way to perform a lookup?

Author
Message
ravi_k
ravi_k
Forum Newbie
Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)

Group: General Forum Members
Points: 9 Visits: 193
I've spent a fair bit of time searching, but can't seem to find a succinct answer...is there a more efficient/faster way of performing the following query? I simply have a lookup table with 45k records and want to search a string field for the presence of any of these lookup values. Currently this query takes 45 min on 800k records. Would it help to index [Table1] on [Field1]? Would a subquery with patindex() be faster?

UPDATE [Table1]
SET [Field2] = 'Yes'
FROM [Table1] A
JOIN [Lookup] as B
ON (A.Field1 like '% ' + B.LookupValue + ' %')

Thanks!
TheSQLGuru
TheSQLGuru
SSC-Dedicated
SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)

Group: General Forum Members
Points: 32415 Visits: 8675
r_noob (7/15/2013)
I've spent a fair bit of time searching, but can't seem to find a succinct answer...is there a more efficient/faster way of performing the following query? I simply have a lookup table with 45k records and want to search a string field for the presence of any of these lookup values. Currently this query takes 45 min on 800k records. Would it help to index [Table1] on [Field1]? Would a subquery with patindex() be faster?

UPDATE [Table1]
SET [Field2] = 'Yes'
FROM [Table1] A
JOIN [Lookup] as B
ON (A.Field1 like '% ' + B.LookupValue + ' %')

Thanks!


I think this is going to be a stunningly bad performer regardless. But an index might help. What might be even faster, believe it or not, is to do an EXISTS query instead of a JOIN. Hell, even a cursor approach could be faster than the join, which could hit millions of times on a 45000X800000 join of that type.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (216K reputation)SSC Guru (216K reputation)SSC Guru (216K reputation)SSC Guru (216K reputation)SSC Guru (216K reputation)SSC Guru (216K reputation)SSC Guru (216K reputation)SSC Guru (216K reputation)

Group: General Forum Members
Points: 216224 Visits: 41986
I agree with Kevin. A looped search would probably be faster than a Join because you could then add a WHERE Field2 <> 'Yes" to the code which would speed things up quite a bit especially if there were some intelligent indexing around that column (at least on a temporary basis).

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Eugene Elutin
Eugene Elutin
SSChampion
SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)

Group: General Forum Members
Points: 12520 Visits: 5478
You may want to try full text search here. Together with using WHERE EXISTS instead of JOIN, should be considerably faster than JOIN with non sargable LIKE. For some reason I think it will win over loop-based aproch too.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help
TheSQLGuru
TheSQLGuru
SSC-Dedicated
SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)

Group: General Forum Members
Points: 32415 Visits: 8675
I am not sure full text indexing will help with the '%' + column + '%' join ...

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Steven Willis
Steven Willis
SSCrazy
SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)

Group: General Forum Members
Points: 2057 Visits: 1721
r_noob (7/15/2013)
I've spent a fair bit of time searching, but can't seem to find a succinct answer...is there a more efficient/faster way of performing the following query? I simply have a lookup table with 45k records and want to search a string field for the presence of any of these lookup values. Currently this query takes 45 min on 800k records. Would it help to index [Table1] on [Field1]? Would a subquery with patindex() be faster?

UPDATE [Table1]
SET [Field2] = 'Yes'
FROM [Table1] A
JOIN [Lookup] as B
ON (A.Field1 like '% ' + B.LookupValue + ' %')

Thanks!


Try this:



IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL
DROP TABLE #TempTable

CREATE TABLE #TempTable (
[ID] INT IDENTITY(1,1) NOT NULL,
[Field1] INT NULL,
[Field2] NVARCHAR(3) NULL,
PRIMARY KEY (ID))

INSERT INTO #TempTable
SELECT 22,'No' UNION ALL
SELECT 31,'No' UNION ALL
SELECT 42,'No' UNION ALL
SELECT 55,'No'

IF OBJECT_ID('tempdb..#LookupTable') IS NOT NULL
DROP TABLE #LookupTable

CREATE TABLE #LookupTable (
[ID] INT IDENTITY(1,1) NOT NULL,
[LookupValue] INT NULL,
[LookDesc] NVARCHAR(30) NULL,
PRIMARY KEY (ID))

INSERT INTO #LookupTable
SELECT 22,'Apple' UNION ALL
SELECT 33,'Orange' UNION ALL
SELECT 44,'Banana' UNION ALL
SELECT 55,'Pear'

UPDATE #TempTable
SET Field2 = 'Yes'
FROM
#TempTable AS A
INNER JOIN
#LookupTable AS B
ON A.ID = B.ID
WHERE
A.Field1 = B.LookupValue


SELECT * FROM #TempTable AS tt





 
ChrisM@home
ChrisM@home
SSCertifiable
SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)

Group: General Forum Members
Points: 5244 Visits: 10607
r_noob (7/15/2013)


... ON (A.Field1 like '% ' + B.LookupValue + ' %')



That looks like words in a sentence. Try Jeff's splitter: split the sentence on space, remove any leading blanks and seek into your lookup table.

Like this:

   SELECT *
FROM #Table1 a
CROSS APPLY dbo.DelimitedSplit8K(a.Field1, ' ') split
WHERE EXISTS (SELECT 1 FROM #Lookup b
WHERE b.LookDesc = split.Item)

SELECT *
FROM #Table1 a
CROSS APPLY dbo.DelimitedSplit8K(a.Field1, ' ') split
INNER LOOP JOIN #Lookup b
ON b.LookDesc = split.Item



Prep:
IF OBJECT_ID('tempdb..#Table1') IS NOT NULL
DROP TABLE #Table1

CREATE TABLE #Table1 (
[ID] INT IDENTITY(1,1) NOT NULL,
[Field1] VARCHAR(100),
[Field2] VARCHAR(3) NULL)

INSERT INTO #Table1
SELECT 'The quick brown fox','No' UNION ALL
SELECT 'jumped over','No' UNION ALL
SELECT 'the','No' UNION ALL
SELECT 'lazy dog','No' UNION ALL
SELECT 'One day in the life of Ivan Denisovitch', 'No'

INSERT INTO #Table1 select a.Field1, 'No' AS Field2 from #Table1 a, sys.columns b

CREATE UNIQUE CLUSTERED INDEX ucx_ID ON #Table1 (ID)



IF OBJECT_ID('tempdb..#Lookup') IS NOT NULL
DROP TABLE #Lookup

CREATE TABLE #Lookup ([LookDesc] VARCHAR(30) NOT NULL)

INSERT INTO #Lookup
SELECT 'dog' UNION ALL
SELECT 'fox' UNION ALL
SELECT TOP 45000 x = RIGHT(NEWID(),5)+RIGHT(NEWID(),5) FROM SYSCOLUMNS A, SYSCOLUMNS B

CREATE UNIQUE CLUSTERED INDEX ucx_LookDesc ON #Lookup (LookDesc)




Low-hanging fruit picker and defender of the moggies





For better assistance in answering your questions, please read this.




Understanding and using APPLY, (I) and (II) Paul White

Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Stefan_G
Stefan_G
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1315 Visits: 961
The performance of Chris' solution is fantastic.

If we use his setup script, we can use the following code to perform the update:

IF OBJECT_ID('tempdb..#t2') IS NOT NULL drop table #t2

SELECT DISTINCT ID
INTO #t2
FROM #Table1 a
CROSS APPLY dbo.DelimitedSplit8K(a.Field1, ' ') split
WHERE EXISTS (SELECT 1 FROM #Lookup b WHERE b.LookDesc = split.Item)

UPDATE #Table1
SET Field2 = 'Yes'
WHERE ID in (
select ID from #t2
)



This code executes in less than 100ms on my machine. I expect similar results for the real data. Not bad to go from 45 minutes to sub-second performance!

Well done!

/SG
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search