Perform logical AND/OR base on number of rows without dynamic SQL

  • Hi there,

    I have entity where the fields are expressed in rows rather than columns.. I know it is possible to convert this row into columns using combination of PIVOT and dynamic SQL, and use the AND/OR operator in filtering the result.. But, is there other way to solve this without using dynamic SQL?

    In the @Search_Parameter, if more than 1 value is specified in the fieldname column, it would perform OR operator then use AND operator against other fieldnames..

    The query I like to generate is similar to this:

    ...WHERE (lastname='smith' OR lastname='jones') AND firstname='john' AND subject='mathematics' AND (school='school 1' OR school='school 2')

    I came up with this solution: (I'm not sure if isMultiValued column is helpful, you can disregard it when your solution does not need it)

    DECLARE @Student TABLE (id int, fieldname varchar(max), value varchar(max), isMultiValued bit)

    DECLARE @Search_Parameter TABLE (fieldname varchar(max), value varchar(max), isMultiValued bit)

    INSERT INTO @Student

    SELECT 1, 'lastname', 'smith', 0 UNION ALL

    SELECT 1, 'firstname', 'john', 0 UNION ALL

    SELECT 1, 'subject', 'mathematics', 1 UNION ALL

    SELECT 1, 'subject', 'geometry', 1 UNION ALL

    SELECT 1, 'school', 'school 1', 1 UNION ALL

    SELECT 2, 'lastname', 'jones', 0 UNION ALL

    SELECT 2, 'firstname', 'john', 0 UNION ALL

    SELECT 2, 'subject', 'mathematics', 1 UNION ALL

    SELECT 2, 'school', 'school 1', 1 UNION ALL

    SELECT 2, 'school', 'school 2', 1

    --SELECT * FROM @Student

    INSERT INTO @Search_Parameter

    SELECT 'lastname','smith', 0 UNION ALL

    SELECT 'lastname','jones', 0 UNION ALL

    SELECT 'firstname','john', 0 UNION ALL

    SELECT 'subject','mathematics', 1 UNION ALL

    SELECT 'school','school 1', 1 UNION ALL

    SELECT 'school','school 2', 1

    --SELECT * FROM @Search_Parameter

    --filtering condition is similar to this one:

    --(lastname='smith' OR lastname='jones') AND firstname='john' AND subject='mathematics' AND (school='school 1' OR school='school 2')

    SELECT DISTINCT id

    FROM @Student s

    WHERE NOT EXISTS (SELECT s2.fieldname,s2.value

    FROM @Student s2

    WHERE s2.isMultiValued = 0

    AND s2.id = s.id

    EXCEPT

    SELECT fieldname,value

    FROM @Search_Parameter

    WHERE isMultiValued = 0

    )

    AND NOT EXISTS (

    SELECT fieldname

    FROM @Search_Parameter

    WHERE isMultiValued = 1

    EXCEPT

    SELECT x.fieldname

    FROM

    (SELECT s2.fieldname,s2.value

    FROM @Student s2

    WHERE s2.isMultiValued = 1

    AND s2.id = s.id

    INTERSECT

    SELECT fieldname,value

    FROM @Search_Parameter

    WHERE isMultiValued = 1

    ) x

    )

    Is there other way to solve this?

    Thanks..

  • I used EXCEPT and INTERSECT to avoid dynamic SQL.. Will this impose performance issue than its dynamic SQL counterpart? Please advise.. I would really appreciate your input..

    Thanks.. 🙂

  • Hi there,

    I tried this one:

    SELECT ID FROM @Student s

    EXCEPT

    SELECT sf.ID FROM @Student s

    INNER JOIN @Search_Parameter sp ON (s.fieldname=sp.fieldname AND s.value=sp.value)

    RIGHT OUTER JOIN @Student sf ON (sf.id=s.id AND sf.fieldname=s.fieldname)

    WHERE s.id IS NULL

    it seems a bit faster

    using this data

    SELECT 1, 'lastname', 'smith', 0 UNION ALL

    SELECT 1, 'firstname', 'john', 0 UNION ALL

    SELECT 1, 'subject', 'mathematics', 1 UNION ALL

    SELECT 1, 'subject', 'geometry', 1 UNION ALL

    SELECT 1, 'school', 'school 1', 1 UNION ALL

    SELECT 2, 'lastname', 'jones', 0 UNION ALL

    SELECT 2, 'firstname', 'john', 0 UNION ALL

    SELECT 2, 'subject', 'mathematics', 1 UNION ALL

    SELECT 2, 'school', 'school 1', 1 UNION ALL

    SELECT 2, 'school', 'school 2', 1 UNION ALL

    SELECT 3, 'lastname', 'jones', 0 UNION ALL

    SELECT 3, 'firstname', 'john', 0 UNION ALL

    SELECT 3, 'subject', 'science', 1 UNION ALL

    SELECT 3, 'school', 'school 1', 1 UNION ALL

    SELECT 3, 'school', 'school 2', 1

    i got these results

    Mine

    [Query Cost (relative to batch): 18%]

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    (2 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#00C31E99'. Scan count 15, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#01B742D2'. Scan count 13, logical reads 13, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Shield's

    [Query Cost (relative to batch): 25%]

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    (2 row(s) affected)

    Table '#01B742D2'. Scan count 17, logical reads 80, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#00C31E99'. Scan count 46, logical reads 46, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 3 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    btw, thanks again for teaching me statistics time and IO ^__^

    _____________________________________________
    [font="Comic Sans MS"]Quatrei Quorizawa[/font]
    :):D:P;):w00t::cool::hehe:
    MABUHAY PHILIPPINES!

    "Press any key...
    Where the heck is the any key?
    hmmm... Let's see... there's ESC, CTRL, Page Up...
    but no any key"
    - Homer Simpson
  • @Quatrei

    Thanks you very much for your help.. Your code is much cleaner and faster now! And this also does not use dynamic SQL.. 😀

    However, I just have this one challenge wherein a field is not present for student.. I just commented the line from your sample data..

    DECLARE @Student TABLE (id int, fieldname varchar(max), value varchar(max), isMultiValued bit)

    DECLARE @Search_Parameter TABLE (fieldname varchar(max), value varchar(max), isMultiValued bit)

    INSERT INTO @Student

    SELECT 1, 'lastname', 'smith', 0 UNION ALL

    SELECT 1, 'firstname', 'john', 0 UNION ALL

    SELECT 1, 'subject', 'mathematics', 1 UNION ALL

    SELECT 1, 'subject', 'geometry', 1 UNION ALL

    --SELECT 1, 'school', 'school 1', 1 UNION ALL

    SELECT 2, 'lastname', 'jones', 0 UNION ALL

    --SELECT 2, 'firstname', 'john', 0 UNION ALL

    SELECT 2, 'subject', 'mathematics', 1 UNION ALL

    SELECT 2, 'school', 'school 1', 1 UNION ALL

    SELECT 2, 'school', 'school 2', 1 UNION ALL

    SELECT 3, 'lastname', 'jones', 0 UNION ALL

    SELECT 3, 'firstname', 'john', 0 UNION ALL

    SELECT 3, 'subject', 'science', 1 UNION ALL

    SELECT 3, 'school', 'school 1', 1 UNION ALL

    SELECT 3, 'school', 'school 2', 1

    INSERT INTO @Search_Parameter

    SELECT 'lastname','smith', 0 UNION ALL

    SELECT 'lastname','jones', 0 UNION ALL

    SELECT 'firstname','john', 0 UNION ALL

    SELECT 'subject','mathematics', 1 UNION ALL

    SELECT 'school','school 1', 1 UNION ALL

    SELECT 'school','school 2', 1

    The expected result would be no record in this case because there is no record where (firstname='john') AND (school='school 1' OR school='school 2') and subject is mathematics...

    When I run your query, it still displays 2 record.. But when I run mine, it displays one record.. Still have bugs 😛

    Thanks for your insight.. I don't really need the column IsMultiValued.. I modified the code earlier and came up with this one..

    SELECT DISTINCT id

    FROM @Student s

    WHERE NOT EXISTS (

    SELECT fieldname

    FROM @Search_Parameter

    EXCEPT

    SELECT x.fieldname

    FROM(

    SELECT s2.fieldname,s2.value

    FROM @Student s2

    WHERE s2.id = s.id

    INTERSECT

    SELECT fieldname,value

    FROM @Search_Parameter

    ) x

    )

    I believe this is not yet the best solution but at least it produces the desired output.. You can use statistics time, io, and other tools to prove this.. 😛 I'm glad to see you're using it..

    Thanks!

  • @joe,

    Thank you for showing me the right direction.. So, this is called EAV model.. I'll stay away from this method and migrate to XML.. With proper xml indexes, it will be better..

    Thanks again..

  • When i was searching for EAV, I came across with this article by Joe Celko (thanks!):

    http://www.simple-talk.com/sql/t-sql-programming/divided-we-stand-the-sql-of-relational-division/

    This presents different methods for relational division.. To my surprise, the topic under "Division with Set Operators" somehow resembles the code I posted earlier using EXCEPT.. This is what I'm looking for. 😀

    I know it will be quite pointless to post the simpler version, since I'll be moving to XML.. But, someone might find this useful.. I'll post it anyway.. 😛

    SELECT DISTINCT id

    FROM @Student s

    WHERE NOT EXISTS (

    SELECT fieldname

    FROM @Search_Parameter

    EXCEPT

    SELECT sp.fieldname

    FROM @Student s2

    JOIN @Search_Parameter sp ON (sp.fieldname=s2.fieldname AND sp.value=s2.value)

    WHERE s2.id = s.id

    )

    Simple JOIN can replace the INTERSECT and derived table..

    Thanks everyone!

  • XML? Why oh why?

    What's wrong with a properly normalised database design?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • CREATE TABLE dbo.SAMPLE (data XML)

    😀

    Much blowing of vuvuzelas ensues.....

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Hi,

    just read the forum today. I know your gonna use XML already but... :hehe:I still wanna have fun with this just for additional knowledge.

    DECLARE @Student TABLE (id int, fieldname varchar(max), value varchar(max), isMultiValued bit)

    DECLARE @Search_Parameter TABLE (fieldname varchar(max), value varchar(max), isMultiValued bit)

    INSERT INTO @Student

    SELECT 1, 'lastname', 'smith', 0 UNION ALL

    SELECT 1, 'firstname', 'john', 0 UNION ALL

    SELECT 1, 'subject', 'mathematics', 1 UNION ALL

    SELECT 1, 'subject', 'geometry', 1 UNION ALL

    SELECT 1, 'school', 'school 1', 1 UNION ALL

    SELECT 2, 'lastname', 'jones', 0 UNION ALL

    SELECT 2, 'firstname', 'john', 0 UNION ALL

    SELECT 2, 'subject', 'mathematics', 1 UNION ALL

    SELECT 2, 'school', 'school 1', 1 UNION ALL

    SELECT 2, 'school', 'school 2', 1 UNION ALL

    SELECT 3, 'lastname', 'jones', 0 UNION ALL

    SELECT 3, 'firstname', 'john', 0 UNION ALL

    SELECT 3, 'subject', 'science', 1 UNION ALL

    SELECT 3, 'school', 'school 1', 1 UNION ALL

    SELECT 3, 'school', 'school 2', 1 UNION ALL

    SELECT 4, 'lastname', 'smith', 0 UNION ALL

    SELECT 4, 'firstname', 'john', 0 UNION ALL

    SELECT 4, 'subject', 'mathematics', 1 UNION ALL

    SELECT 4, 'subject', 'geometry', 1 UNION ALL

    --SELECT 4, 'school', 'school 1', 1 UNION ALL

    SELECT 5, 'lastname', 'jones', 0 UNION ALL

    --SELECT 5, 'firstname', 'john', 0 UNION ALL

    SELECT 5, 'subject', 'mathematics', 1 UNION ALL

    SELECT 5, 'school', 'school 1', 1 UNION ALL

    SELECT 5, 'school', 'school 2', 1

    INSERT INTO @Search_Parameter

    SELECT 'lastname','smith', 0 UNION ALL

    SELECT 'lastname','jones', 0 UNION ALL

    SELECT 'firstname','john', 0 UNION ALL

    SELECT 'subject','mathematics', 1 UNION ALL

    SELECT 'school','school 1', 1 UNION ALL

    SELECT 'school','school 2', 1

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    PRINT '

    Mine version 2'

    SELECT ID FROM @Student s

    EXCEPT

    (

    SELECT DISTINCT sf.ID FROM @Student s

    INNER JOIN @Search_Parameter sp ON (s.fieldname=sp.fieldname AND s.value=sp.value)

    RIGHT OUTER JOIN @Student sf ON (sf.id=s.id AND sf.fieldname=s.fieldname)

    WHERE s.id IS NULL

    UNION

    SELECT DISTINCT s.id

    FROM @Search_Parameter sp

    CROSS JOIN @Student s

    WHERE NOT EXISTS (SELECT 1 FROM @Student i WHERE s.id=i.id AND i.fieldname=sp.fieldname)

    )

    PRINT '

    Shield''s version 2'

    SELECT DISTINCT id

    FROM @Student s

    WHERE NOT EXISTS (

    SELECT fieldname

    FROM @Search_Parameter

    EXCEPT

    SELECT x.fieldname

    FROM(

    SELECT s2.fieldname,s2.value

    FROM @Student s2

    WHERE s2.id = s.id

    INTERSECT

    SELECT fieldname,value

    FROM @Search_Parameter

    ) x

    )

    PRINT '

    Shield''s version 3'

    SELECT DISTINCT id

    FROM @Student s

    WHERE NOT EXISTS (

    SELECT fieldname

    FROM @Search_Parameter

    EXCEPT

    SELECT sp.fieldname

    FROM @Student s2

    JOIN @Search_Parameter sp ON (sp.fieldname=s2.fieldname AND sp.value=s2.value)

    WHERE s2.id = s.id

    )

    PRINT '

    Mine version 2 + Shield''s version 3'

    SELECT ID FROM @Student s

    EXCEPT

    (

    SELECT DISTINCT sf.ID FROM @Student s

    INNER JOIN @Search_Parameter sp ON (s.fieldname=sp.fieldname AND s.value=sp.value)

    RIGHT OUTER JOIN @Student sf ON (sf.id=s.id AND sf.fieldname=s.fieldname)

    WHERE s.id IS NULL

    UNION

    SELECT DISTINCT id

    FROM @Student s

    WHERE EXISTS (

    SELECT fieldname

    FROM @Search_Parameter

    EXCEPT

    SELECT sp.fieldname

    FROM @Student s2

    JOIN @Search_Parameter sp ON (sp.fieldname=s2.fieldname AND sp.value=s2.value)

    WHERE s2.id = s.id

    )

    )

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    also, which is more important, Scan count or logical reads???

    I don't really know which among these is the best implementation or maybe you guys could provide something better. "mine version 2" seems ok but has big logical reads O_O

    just playing around :hehe:

    Thanks in advance

    _____________________________________________
    [font="Comic Sans MS"]Quatrei Quorizawa[/font]
    :):D:P;):w00t::cool::hehe:
    MABUHAY PHILIPPINES!

    "Press any key...
    Where the heck is the any key?
    hmmm... Let's see... there's ESC, CTRL, Page Up...
    but no any key"
    - Homer Simpson
  • GilaMonster (9/12/2010)


    XML? Why oh why?

    What's wrong with a properly normalised database design?

    I think using XML datatype model would be more efficient.. I need to test them both though..

    The query that I came up with (using XML datatype) is not yet the best solution to achieve the desired result.. I avoid the .node method because it would result into similar structure as above.. I'd post it if I have time..

    Thanks!

  • Quatrei.X (9/13/2010)


    also, which is more important, Scan count or logical reads???

    I don't really know which among these is the best implementation or maybe you guys could provide something better. "mine version 2" seems ok but has big logical reads O_O

    Thank you for time and effort to test the performance of different queries..

    I agree that "mine version 2" executes faster in this given sample data.. But as you insert more data, the query becomes slower..

    You may try inserting this one: (Data of student 1 is replicated.. I'm lazy in generating test data.. :hehe:)

    INSERT INTO @Student

    SELECTROW_NUMBER() OVER (PARTITION BY y.fieldname,y.value,y.isMultiValued ORDER BY (SELECT NULL))+5

    ,y.fieldname,y.value,y.isMultiValued

    FROM (

    --same as record 1

    SELECT 'lastname', 'smith', 0 UNION ALL

    SELECT 'firstname', 'john', 0 UNION ALL

    SELECT 'subject', 'mathematics', 1 UNION ALL

    SELECT 'subject', 'geometry', 1 UNION ALL

    SELECT 'school', 'school 1', 1

    ) y(fieldname,value,isMultiValued)

    ,sys.columns

    ORDER BY ROW_NUMBER() OVER (PARTITION BY y.fieldname,y.value,y.isMultiValued ORDER BY (SELECT NULL))

    Based from observation, logical read is more important.. ("mine version 2" has the least scan count but has the biggest logical reads).. Note that Scalar UDFs and CLR can't be measure correctly using SET STATISTICS IO..

    This is other solution which will execute faster for this case:

    SELECT s.id

    FROM @Student AS s

    JOIN @Search_Parameter AS sp ON sp.fieldname=s.fieldname AND sp.value=s.value

    GROUP BY s.id

    HAVING COUNT(DISTINCT s.fieldname)=(SELECT COUNT(DISTINCT fieldname) FROM @Search_Parameter)

    Hope this helps.. Thanks!

  • shield_21 (9/14/2010)


    GilaMonster (9/12/2010)


    XML? Why oh why?

    What's wrong with a properly normalised database design?

    I think using XML datatype model would be more efficient.. I need to test them both though..

    The query that I came up with (using XML datatype) is not yet the best solution to achieve the desired result.. I avoid the .node method because it would result into similar structure as above.. I'd post it if I have time..

    Thanks!

    It is good that you are going to test them both. I would lean to the normalized design as well based on experience with both. In my experiences the normalized has always won out. Despite that - it is good to test.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • This is other solution which will execute faster for this case:

    Is irrelevant , you can only talk about faster / slower / good performing / bad performing once you have a data set that is at least equal to the maximum size you would expect in the real world.

    Testing against 10, 20 ,30 ,100 rows in a table variable is meaningless if you really require a million rows.



    Clear Sky SQL
    My Blog[/url]

  • shield_21 (9/12/2010)


    I know it will be quite pointless to post the simpler version, since I'll be moving to XML

    Let us know how that works out performance wise. I believe you're jumping from fat to fire in this case.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply