Querying records on multiple XML Node elements

  • I can't help but wonder if it'd be faster (and easier) to use XPath/XQuery to get the results you want. A rough stab would be something like:

    Declare @nodename1 varchar(10) = 'e1'
    Declare @nodevalue1 varchar(10) = '404'
    Declare @nodename2 varchar(10) = 'e3'
    Declare @nodevalue2 varchar(10) = 'Yopapa'
    Select *
    From Test T
    Where T.xmlData.exist('/Data/*[local-name()=sql:variable("@nodename1") and text()=sql:variable("@nodevalue1")]') = 1
    And T.xmlData.exist('/Data/*[local-name()=sql:variable("@nodename2") and text()=sql:variable("@nodevalue2")]') = 1

    Doubtless the XML queries can be improved (combined maybe?)

  • andycadley - Sunday, October 14, 2018 8:08 AM

    I can't help but wonder if it'd be faster (and easier) to use XPath/XQuery to get the results you want. A rough stab would be something like:

    Declare @nodename1 varchar(10) = 'e1'
    Declare @nodevalue1 varchar(10) = '404'
    Declare @nodename2 varchar(10) = 'e3'
    Declare @nodevalue2 varchar(10) = 'Yopapa'
    Select *
    From Test T
    Where T.xmlData.exist('/Data/*[local-name()=sql:variable("@nodename1") and text()=sql:variable("@nodevalue1")]') = 1
    And T.xmlData.exist('/Data/*[local-name()=sql:variable("@nodename2") and text()=sql:variable("@nodevalue2")]') = 1

    Doubtless the XML queries can be improved (combined maybe?)

Well, I don't know in advance which elements will be used to do the search. I also don't know how many will be used. That's the entire reason to pass a table-valued parameter: to match any and all values sent in. If nothing is sent in, nothing will be returned, but that's a mistake - one shouldn't call the SP in the first place if no filtering is occurring.

Speaking of performance, I've done some reading, and if I understand correctly JSON performs better than XML even if XML indexes are added! So I've kept a solution ready with JSON as well. All that was needed was to use OPENJSON - basically a one-line change from the previous XML query.
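For reference, a minimal sketch of the JSON variant of the single-pair check, assuming the documents look like `{"Data":{"e1":"404","e3":"Yopapa"}}` and live in a hypothetical jsonData column on the same Test table:

```sql
-- Hypothetical sketch: jsonData is an nvarchar(max) column holding JSON,
-- e.g. {"Data":{"e1":"404","e3":"Yopapa"}}.
Declare @nodename1 varchar(10) = 'e1';
Declare @nodevalue1 varchar(10) = '404';

-- OPENJSON with the default schema returns one row per property,
-- exposing [key], [value] and [type] columns.
Select T.*
From   Test T
Where  Exists (
           Select 1
           From   OpenJson(T.jsonData, '$.Data') J
           Where  J.[key]   = @nodename1
             And  J.[value] = @nodevalue1
       );
```

This relies on the default OPENJSON schema rather than a WITH clause, since the property names aren't known in advance.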

  • Pagan DBA - Sunday, October 14, 2018 8:24 AM

    andycadley - Sunday, October 14, 2018 8:08 AM

    I can't help but wonder if it'd be faster (and easier) to use XPath/XQuery to get the results you want. A rough stab would be something like:

    Declare @nodename1 varchar(10) = 'e1'
    Declare @nodevalue1 varchar(10) = '404'
    Declare @nodename2 varchar(10) = 'e3'
    Declare @nodevalue2 varchar(10) = 'Yopapa'
    Select *
    From Test T
    Where T.xmlData.exist('/Data/*[local-name()=sql:variable("@nodename1") and text()=sql:variable("@nodevalue1")]') = 1
    And T.xmlData.exist('/Data/*[local-name()=sql:variable("@nodename2") and text()=sql:variable("@nodevalue2")]') = 1

    Doubtless the XML queries can be improved (combined maybe?)

Well, I don't know in advance which elements will be used to do the search. I also don't know how many will be used. That's the entire reason to pass a table-valued parameter: to match any and all values sent in. If nothing is sent in, nothing will be returned, but that's a mistake - one shouldn't call the SP in the first place if no filtering is occurring.

Speaking of performance, I've done some reading, and if I understand correctly JSON performs better than XML even if XML indexes are added! So I've kept a solution ready with JSON as well. All that was needed was to use OPENJSON - basically a one-line change from the previous XML query.

I would test that against your queries. I have not found any significant difference in perf, neither better nor worse. Frankly, work with what you are natively getting from the source, and what's most easily supported there, unless your org has a hard preference for one or the other.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (4) - Sunday, October 14, 2018 1:17 PM

I would test that against your queries. I have not found any significant difference in perf, neither better nor worse. Frankly, work with what you are natively getting from the source, and what's most easily supported there, unless your org has a hard preference for one or the other.

    Point.

So I'm going to wait to get some data. Besides volumes, I need to see access patterns. For some it could be write once and keep querying. For others it could be updating parts of the JSON/XML often but not querying so much. Right now I am focused on making sure I can search against arbitrary content. Querying aside, if I find "patching" the XML/JSON is more efficient one way than the other, then I might even allow clients to decide how to store their data. They can pick their poison in case one works better than the other for their individual situation.

Yes, I'd be surprised if a blanket statement such as "JSON is always faster than XML" were true with SQL Server; nothing is ever quite that simple. Using a table variable with multiple rows doesn't have to rule out doing the seeking with XQuery, though - you just have to use sql:column rather than sql:variable (and structure the query to cope with multiple criteria in a suitable way, depending on whether it's an AND or an OR type combination).
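To illustrate the sql:column idea, here's a rough sketch under the assumptions from the thread (Test table, xmlData column) plus a hypothetical @Filters table variable, with AND semantics expressed as "no filter row may fail to match":

```sql
-- Hypothetical sketch: @Filters carries one row per element/value pair.
Declare @Filters table (NodeName varchar(10), NodeValue varchar(10));
Insert Into @Filters Values ('e1', '404'), ('e3', 'Yopapa');

-- AND combination: a row qualifies only if no filter row fails to match.
-- sql:column() pulls the current F row's values into the XQuery predicate.
Select T.*
From   Test T
Where  Not Exists (
           Select 1
           From   @Filters F
           Where  T.xmlData.exist('/Data/*[local-name()=sql:column("F.NodeName")
                                           and text()=sql:column("F.NodeValue")]') = 0
       );
```

For an OR combination, flip the shape to a plain Exists over @Filters checking exist(...) = 1.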

  • andycadley - Sunday, October 14, 2018 8:08 AM

    I can't help but wonder if it'd be faster (and easier) to use XPath/XQuery to get the results you want. A rough stab would be something like:

    Declare @nodename1 varchar(10) = 'e1'
    Declare @nodevalue1 varchar(10) = '404'
    Declare @nodename2 varchar(10) = 'e3'
    Declare @nodevalue2 varchar(10) = 'Yopapa'
    Select *
    From Test T
    Where T.xmlData.exist('/Data/*[local-name()=sql:variable("@nodename1") and text()=sql:variable("@nodevalue1")]') = 1
    And T.xmlData.exist('/Data/*[local-name()=sql:variable("@nodename2") and text()=sql:variable("@nodevalue2")]') = 1

    Doubtless the XML queries can be improved (combined maybe?)

Careful here: the exist function in the where clause introduces a separate XML reader for each predicate - in this case four XML readers, a full table scan, a constant scan, three nested loops and a lazy spool to join up the results from each function. In total the plan has eight nested loops, nine filters, two lazy spools, four compute scalars and a full table scan.
    😎

This is significantly more complex than the plan for my code.

  • andycadley - Sunday, October 14, 2018 2:16 PM

Yes, I'd be surprised if a blanket statement such as "JSON is always faster than XML" were true with SQL Server; nothing is ever quite that simple. Using a table variable with multiple rows doesn't have to rule out doing the seeking with XQuery, though - you just have to use sql:column rather than sql:variable (and structure the query to cope with multiple criteria in a suitable way, depending on whether it's an AND or an OR type combination).

    Okay, maybe I'm not understanding what you are saying.

I do not know whether "e1" or "e2" or ... will be passed in. It could be "e495" or "e23". I don't know the names of the node elements. I don't know how many node elements will be passed to match in the XML. All I know is that the structure of the data is single elements under <data></data>.

In your example, you have one row for each element passed in. Now, like I said in an earlier post, it might be reasonable to say clients can pass up to 4 elements for filtering. If that's going to be faster then I will run an experiment when I get the data. Like I keep saying, I wanted the code to work generically, and therefore I wanted to pass a table variable with any number of rows, one for each element and value the client wants to filter on.
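The generic "any number of filter rows, all must match" requirement can also be sketched as relational division over shredded XML - a rough sketch, assuming a hypothetical @Filters table variable with unique pairs and each element name appearing at most once under /Data:

```sql
-- Hypothetical sketch: every row in @Filters must find a matching element.
Declare @Filters table (NodeName varchar(10), NodeValue varchar(10));
Insert Into @Filters Values ('e1', '404'), ('e3', 'Yopapa');

-- Relational division: the count of matched filter rows must equal
-- the total count of filter rows.
Select T.*
From   Test T
Where  (Select Count(*) From @Filters) =
       (Select Count(*)
        From   @Filters F
        Cross Apply T.xmlData.nodes('/Data/*') N(x)
        Where  N.x.value('local-name(.)', 'varchar(10)') = F.NodeName
          And  N.x.value('text()[1]',     'varchar(10)') = F.NodeValue);
```

If duplicate element names are possible under /Data, the inner count would need to be de-duplicated before comparing.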

  • Pagan DBA - Monday, October 15, 2018 6:06 AM

    andycadley - Sunday, October 14, 2018 2:16 PM

    Yes, I'd be surprised if a blanket statement such as JSON is always faster than XML were true with SQL Server, nothing is ever quite that simple. Using a table variable with multiple rows doesn't have to eliminate doing the seeking with XQuery though, you just have to use sql:column rather than sql:variable (and structure the query to cope with multiple criteria in a suitable way depending on whether it's and AND or OR type combination).

    Okay, maybe I'm not understanding what you are saying.

I do not know whether "e1" or "e2" or ... will be passed in. It could be "e495" or "e23". I don't know the names of the node elements. I don't know how many node elements will be passed to match in the XML. All I know is that the structure of the data is single elements under <data></data>.

In your example, you have one row for each element passed in. Now, like I said in an earlier post, it might be reasonable to say clients can pass up to 4 elements for filtering. If that's going to be faster then I will run an experiment when I get the data. Like I keep saying, I wanted the code to work generically, and therefore I wanted to pass a table variable with any number of rows, one for each element and value the client wants to filter on.

    Will the users pass the element names?
    😎

  • Eirikur Eiriksson - Monday, October 15, 2018 6:14 AM

    Pagan DBA - Monday, October 15, 2018 6:06 AM

    Will the users pass the element names?
    😎

Yes, they will. Just like "First Name" in the UI maps to (say) a first_nm column in a SQL query. Since the end "user" is choosing to persist named data, i.e. "e1", "e2", whatever, they have to ask to filter by "e1", "e2", etc. Garbage in, garbage out. There is "something" they can persist in canned fields; the rest is up to them.

  • Eirikur Eiriksson - Monday, October 15, 2018 6:00 AM

    andycadley - Sunday, October 14, 2018 8:08 AM

    I can't help but wonder if it'd be faster (and easier) to use XPath/XQuery to get the results you want. A rough stab would be something like:

    Declare @nodename1 varchar(10) = 'e1'
    Declare @nodevalue1 varchar(10) = '404'
    Declare @nodename2 varchar(10) = 'e3'
    Declare @nodevalue2 varchar(10) = 'Yopapa'
    Select *
    From Test T
    Where T.xmlData.exist('/Data/*[local-name()=sql:variable("@nodename1") and text()=sql:variable("@nodevalue1")]') = 1
    And T.xmlData.exist('/Data/*[local-name()=sql:variable("@nodename2") and text()=sql:variable("@nodevalue2")]') = 1

    Doubtless the XML queries can be improved (combined maybe?)

Careful here: the exist function in the where clause introduces a separate XML reader for each predicate - in this case four XML readers, a full table scan, a constant scan, three nested loops and a lazy spool to join up the results from each function. In total the plan has eight nested loops, nine filters, two lazy spools, four compute scalars and a full table scan.
    😎

This is significantly more complex than the plan for my code.

It's more complex, but it has a lower cost and consistently outperforms the simpler plan on my local machine (SQL 2017), as well as allowing you to return the XML without another query. That may not always be the case, but it's worth checking.
