Comparing multiple values with column value

  • Hi,

    I'm trying to make a store to return some row's data given some public ids (string).

    I've come across 3 methods to do so but I'm having some trouble finding out witch is the best.

    declare @ads nvarchar(100) = ',1nohFsEuMcy,1nohFsEvaJq,1nohFsEvBnk,'

    declare @adsx xml = '1nohFsEuMcy1nohFsEvaJq1nohFsEvBnk'

    declare @i int

    select id, pubKey from xpt where charindex(','+pubkey+',', @ads) > 0

    select id, pubKey from xpt where exists (select 1 FROM @adsx.nodes('/r/ad') t(n) where pubkey = t.n.value('(.)[1]', 'NVARCHAR(20)') )

    exec sp_xml_preparedocument @i output, @adsx

    select id, pubKey from xpt where exists (select 1 from (select pk from openxml(@i, '/r/ad', 3) with (pk nvarchar(20) '.')) t where pubKey=t.pk )

    exec sp_xml_removedocument @i

    Can any one help me choose witch method is best?

    Thanks,

    Pedro



    If you need to work better, try working less...

  • I'd go with this:

    select id, pubKey

    from xpt

    inner join

    (select ADSX.Nodes.value('(.)[1]','varchar(100)') as ADSXValue

    from @adsx.nodes('/R/AD') ADSX(Nodes)) Sub

    on xpt.pubKey = Sub.ADSXValue;

    Edit: Actually, I'd probably do the derived table as a CTE, since those are easier to read, but my test ended up formatted this way.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I've also noticed, that since I know the maximum number of rows I'll return, that using the TOP N the SQL has less logical reads since it stops after having N records and doesn't continue searching.

    But to do this I'd have to use the nodes() function twice... Once to make a COUNT() into a variable to use with the SELECT TOP (@variable) and another in the SELECT.

    Would it be better, to read the values the first time into a TABLE VARIABLE and use that TABLE variable instead of the nodes()?

    set statistics io on

    declare @adsx xml = '1nohFsEuMcy1nohFsEvaJq1nohFsEvBnk'

    declare @ids table (pkid nvarchar(20))

    insert into @ids select t.n.value('(.)[1]', 'NVARCHAR(20)') FROM @adsx.nodes('/r/ad') t(n)

    declare @cnt int = (select COUNT(1) from @ids)

    select top (@cnt) id, pubKey from xpt inner join @ids on pubkey = pkid



    If you need to work better, try working less...

  • Adding top to this just adds processor cycles without accomplishing anything.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I made some tests and notice that... since is made a JOIN with a select and not used the IN clause the number of logical reads is lower.

    select idad, publickey from adsmain inner join (select t.n.value('(.)[1]', 'NVARCHAR(20)') pk FROM @adsx.nodes('/r/ad') t(n)) t on publickey = pk

    Table 'AdsMain'. Scan count 3, logical reads 9, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    select idad, publickey from adsmain where publickey in (select t.n.value('(.)[1]', 'NVARCHAR(20)') pk FROM @adsx.nodes('/r/ad') t(n))

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'AdsMain'. Scan count 5, logical reads 1538, physical reads 13, read-ahead reads 821, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    much lower 😀

    I executed the following code and asked for the Execution Plan:

    declare @ads nvarchar(100) = ',1nohFsEuMcy,1nohFsEvaJq,1nohFsEvBnk,'

    declare @adsx xml = '1nohFsEuMcy1nohFsEvaJq1nohFsEvBnk'

    select idad, publickey from adsmain where charindex(','+publickey+',', @ads) > 0

    select idad, publickey from adsmain inner join (select t.n.value('(.)[1]', 'NVARCHAR(20)') pk FROM @adsx.nodes('/R/AD') t(n)) t on publickey = pk

    Do you know why SQL Server says that the 1st query only cost 1% of the query batch and the 2nd 99%?

    The 2nd one is best and still SQL Server gives this numbers out!! Is the nodes() function still a mistery to the tunning engine?!

    Thanks in advance,

    Pedro



    If you need to work better, try working less...

  • Execution plans costs and percentages are misleading. They are quite often just plain wrong. Personally, I wish MS would get rid of them, because they cause more problems than they solve.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply