Querying records on multiple XML Node elements

  • So I've been trying to solve my problem for a few days now with the help of the famous XMLTable by Jacob Sebastian (where are you?!?!?!). Perhaps using XMLTable is itself the issue because my problem cannot be solved by it. Regardless, thought I would post because it seems to me it would be a common problem to solve when using structured but non tabular data .

    Sample Data
    CREATE TABLE dbo.Test
    (
    id INT IDENTITY PRIMARY KEY,
    something VARCHAR (1024),
    xmlData XML
    );

    INSERT INTO dbo.Test
    SELECT something = 'first', xmlData = '<Data><e1>404</e1><e2>Dont Care</e2><e3>Yomama</e3></Data>'

    INSERT INTO dbo.Test
    SELECT something = 'second', xmlData = '<Data><e1>500</e1><e2>Whoop Dee Do</e2><e3>Yikes</e3></Data>'

    INSERT INTO dbo.Test
    SELECT something = 'third', xmlData = '<Data><e1>200</e1><e2>Nice</e2></Data>'

    INSERT INTO dbo.Test
    SELECT something = 'fourth', xmlData = '<Data><e1>404</e1><e2>Ho hum</e2><e3>Yopapa</e3></Data>'

    INSERT INTO dbo.Test
    SELECT something = 'fifth', xmlData = '<Data><e1>200</e1><e2>Nice</e2><e3>Fantastic</e3></Data>'

    INSERT INTO dbo.Test
    SELECT something = 'sixth', xmlData = '<Data><e2>Whoop Dee Do</e2><e3>Whoa!</e3></Data>'


    Basic Question

    How do I write query to select the 'fourth' row? The following does not work. I know why it does not work because XMLTable function returns separate rows for e1 and e3, but I feel there has to be a way somehow.

    SELECT * FROM dbo.Test
    CROSS APPLY fn_xml_table(xmlData)
    WHERE NodeName = 'e1' AND Value = '404' AND NodeName = 'e3' AND Value = 'Yopapa'

    Advanced Question
    Cannot do this until I solve the Basic Question above. What I want to do is include my query in a Stored Procedure and Pass a Table Valued Parameter to it with values for e1 and e3 (basically whatever node element values I want to match) like this
    Name | Value
    ------------------
    e1      | 404
    e3      | Yopapa

    and have it return the 'fourth' row.

    Right now, I'm doing some nasty SQL with UNION and concatenation based on whether I want to pass values for e1/e2/e3, all of them, some of them and then using EXEC, and it doesn't really work and good for me I have a costly laptop so my fist does not go through the screen.

    If someone can at least tell me XMLTable is not the way to go for starters, and offer any other advice?

  • I am sorry - but I don't recall  Jacob's XMl table function.  That said - the direct SQLXML query to get that back looks something like :

    select test.*
     from test cross apply xmldata.nodes('/') a(b)
    where b.value('(Data/e1)[1]','int')=404 and
      b.value('(Data/e3)[1]','varchar(30)')='Yopapa'

    As to your advanced question, you'd want to combine that (should you choose to do so) with something akin to a "catch all" query as described here:

    https://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/

    Now as to as to the BIG question ("should I do even be using XMLtable?") - the first question would be - what's the actual source of data?  If you're building the XML table from an existing data source, then a resounding NO would be my first reaction.  If you really have just XML then perhaps shed a bit more light on what you actually are up against.  As of now - it's not easy to provide any solid advice.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • This should get you started
    😎
    By handling the set as an EAV, one can avoid the perils of the "catch all"


    USE TEEST;
    GO
    SET NOCOUNT ON;
    -- https://www.sqlservercentral.com/Forums/2002704/Querying-records-on-multiple-XML-Node-elements
    IF OBJECT_ID(N'dbo.TestXML05') IS NOT NULL DROP TABLE dbo.TestXML05;
    CREATE TABLE dbo.TestXML05
    (
    id INT IDENTITY PRIMARY KEY,
    something VARCHAR (1024),
    xmlData XML
    );

    INSERT INTO dbo.TestXML05
    SELECT something = 'first', xmlData = '<Data><e1>404</e1><e2>Dont Care</e2><e3>Yomama</e3></Data>'  UNION ALL
    SELECT something = 'second', xmlData = '<Data><e1>500</e1><e2>Whoop Dee Do</e2><e3>Yikes</e3></Data>' UNION ALL
    SELECT something = 'third', xmlData = '<Data><e1>200</e1><e2>Nice</e2></Data>'         UNION ALL
    SELECT something = 'fourth', xmlData = '<Data><e1>404</e1><e2>Ho hum</e2><e3>Yopapa</e3></Data>'   UNION ALL
    SELECT something = 'fifth', xmlData = '<Data><e1>200</e1><e2>Nice</e2><e3>Fantastic</e3></Data>'   UNION ALL
    SELECT something = 'sixth', xmlData = '<Data><e2>Whoop Dee Do</e2><e3>Whoa!</e3></Data>';

    -- SEARCH PREDICATES
    DECLARE @ELEMENT_NAME1 VARCHAR(50) = 'e1';
    DECLARE @ELEMENT_VALUE1 VARCHAR(50) = '404';
    DECLARE @ELEMENT_NAME2 VARCHAR(50) = 'e3';
    DECLARE @ELEMENT_VALUE2 VARCHAR(50) = 'Yopapa';
    DECLARE @MATCHCOUNT  INT   = 2;

    -- SEARCH QUERY
    ;WITH BASE_DATA AS
    (
      SELECT
       T5.id
       ,T5.something
       ,TX.DATA.value('local-name(.)','varchar(50)')  AS ELEMENT_NAME
       ,TX.DATA.value('(./text())[1]','varchar(50)')  AS ELEMENT_VALUE
      FROM  dbo.TestXML05 T5
      OUTER APPLY T5.xmlData.nodes('/Data//*') TX(DATA)
    )
    SELECT
      BD.id
     ,BD.something
    FROM  BASE_DATA BD
    WHERE
      (
       BD.ELEMENT_NAME  = @ELEMENT_NAME1
      AND 
       BD.ELEMENT_VALUE  = @ELEMENT_VALUE1
      )
    OR
      (
       BD.ELEMENT_NAME  = @ELEMENT_NAME2
      AND 
       BD.ELEMENT_VALUE  = @ELEMENT_VALUE2
      )
    GROUP BY
      BD.id
     ,BD.something
    HAVING COUNT(*) = @MATCHCOUNT;
    ;


    Output
    id    something
    4    fourth

  • Matt Miller (4) - Friday, October 12, 2018 8:07 PM

    I am sorry - but I don't recall  Jacob's XMl table function.  That said - the direct SQLXML query to get that back looks something like :

    select test.*
     from test cross apply xmldata.nodes('/') a(b)
    where b.value('(Data/e1)[1]','int')=404 and
      b.value('(Data/e3)[1]','varchar(30)')='Yopapa'

    As to your advanced question, you'd want to combine that (should you choose to do so) with something akin to a "catch all" query as described here:

    https://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/

    Now as to as to the BIG question ("should I do even be using XMLtable?") - the first question would be - what's the actual source of data?  If you're building the XML table from an existing data source, then a resounding NO would be my first reaction.  If you really have just XML then perhaps shed a bit more light on what you actually are up against.  As of now - it's not easy to provide any solid advice.

    Thank you for your reply.

    First, I was referring to the following work: https://jacobsebastian.com/category/xml/
    Second, and needless to say I'm going to try work with what you provided. To answer your other question, I need "clients" to be able to persist information such as e1, e2, e3, ... The set of these elements is finite, BUT not each will persist all fields. In my sample data you might have noticed in one row I excluded one of the fields, assuming set is finite at e1, e2, e3.

    So now I need to take your solution and then join it with my table valued parameter in the 'where' clause since I can't hard code e1,e3 values for querying as they need to be supplied by client at runtime.

  • Eirikur Eiriksson - Saturday, October 13, 2018 2:45 AM

    This should get you started
    😎
    By handling the set as an EAV, one can avoid the perils of the "catch all"


    USE TEEST;
    GO
    SET NOCOUNT ON;
    -- https://www.sqlservercentral.com/Forums/2002704/Querying-records-on-multiple-XML-Node-elements
    IF OBJECT_ID(N'dbo.TestXML05') IS NOT NULL DROP TABLE dbo.TestXML05;
    CREATE TABLE dbo.TestXML05
    (
    id INT IDENTITY PRIMARY KEY,
    something VARCHAR (1024),
    xmlData XML
    );

    INSERT INTO dbo.TestXML05
    SELECT something = 'first', xmlData = '<Data><e1>404</e1><e2>Dont Care</e2><e3>Yomama</e3></Data>'  UNION ALL
    SELECT something = 'second', xmlData = '<Data><e1>500</e1><e2>Whoop Dee Do</e2><e3>Yikes</e3></Data>' UNION ALL
    SELECT something = 'third', xmlData = '<Data><e1>200</e1><e2>Nice</e2></Data>'         UNION ALL
    SELECT something = 'fourth', xmlData = '<Data><e1>404</e1><e2>Ho hum</e2><e3>Yopapa</e3></Data>'   UNION ALL
    SELECT something = 'fifth', xmlData = '<Data><e1>200</e1><e2>Nice</e2><e3>Fantastic</e3></Data>'   UNION ALL
    SELECT something = 'sixth', xmlData = '<Data><e2>Whoop Dee Do</e2><e3>Whoa!</e3></Data>';

    -- SEARCH PREDICATES
    DECLARE @ELEMENT_NAME1 VARCHAR(50) = 'e1';
    DECLARE @ELEMENT_VALUE1 VARCHAR(50) = '404';
    DECLARE @ELEMENT_NAME2 VARCHAR(50) = 'e3';
    DECLARE @ELEMENT_VALUE2 VARCHAR(50) = 'Yopapa';
    DECLARE @MATCHCOUNT  INT   = 2;

    -- SEARCH QUERY
    ;WITH BASE_DATA AS
    (
      SELECT
       T5.id
       ,T5.something
       ,TX.DATA.value('local-name(.)','varchar(50)')  AS ELEMENT_NAME
       ,TX.DATA.value('(./text())[1]','varchar(50)')  AS ELEMENT_VALUE
      FROM  dbo.TestXML05 T5
      OUTER APPLY T5.xmlData.nodes('/Data//*') TX(DATA)
    )
    SELECT
      BD.id
     ,BD.something
    FROM  BASE_DATA BD
    WHERE
      (
       BD.ELEMENT_NAME  = @ELEMENT_NAME1
      AND 
       BD.ELEMENT_VALUE  = @ELEMENT_VALUE1
      )
    OR
      (
       BD.ELEMENT_NAME  = @ELEMENT_NAME2
      AND 
       BD.ELEMENT_VALUE  = @ELEMENT_VALUE2
      )
    GROUP BY
      BD.id
     ,BD.something
    HAVING COUNT(*) = @MATCHCOUNT;
    ;


    Output
    id    something
    4    fourth

    Thank you very much. Like I said in another reply earlier, life would be perfect if I wouldn't have to hard code the query parameter values e1, and e3. However, I'm okay to also stipulate, client cannot pass more than 3 element values for filtering out data. After all, no point in being anal about things. So I will take time to research options this weekend and hopefully am able to arrive at reasonable solution.

  • Some progress! I feel I'm closer to the solution. Not using XMLTable since it's just more clutter than I need. The 'NodeRef' column in the results I have provided just for reference so I can see how the match is taking place.

    DECLARE @match TABLE (
        name VARCHAR (64),
        val VARCHAR (64)
    )
    INSERT @match SELECT name = 'e1', val = '404'
    INSERT @match SELECT name = 'e3', val = 'Yopapa'

    --INSERT @match SELECT name = 'e2', val = 'Nice'
    --INSERT @match SELECT name = 'e1', val = '200'
    --INSERT @match SELECT name = 'e3', val = 'Fantastic'

    SELECT * FROM @match

    SELECT M.*, c.value('local-name(.)','NVARCHAR(64)') AS NodeRef FROM dbo.Test M
    CROSS APPLY xmlData.nodes('/Data/*') T(c)
    JOIN @match p ON p.name = c.value('local-name(.)','NVARCHAR(64)') AND p.val = c.value('text()[1]','NVARCHAR(64)')

    20    first     <Data><e1>404</e1><e2>Dont Care</e2><e3>Yomama</e3></Data>    e1
    23    fourth    <Data><e1>404</e1><e2>Ho hum</e2><e3>Yopapa</e3></Data>       e1
    23    fourth    <Data><e1>404</e1><e2>Ho hum</e2><e3>Yopapa</e3></Data>       e3

    Still need to figure out how to eliminate duplicates. Would really appreciate if someone can beat me to it. You can tell from the hour of the day this has kept me up all night :sick:

  • Here is an alternative which counts and matches the value parameters passed, can be extended to tens of parameters without any column names.
    😎


    USE TEEST;
    GO
    SET NOCOUNT ON;
    -- https://www.sqlservercentral.com/Forums/2002704/Querying-records-on-multiple-XML-Node-elements
    IF OBJECT_ID(N'dbo.TestXML05') IS NOT NULL DROP TABLE dbo.TestXML05;
    CREATE TABLE dbo.TestXML05
    (
    id INT IDENTITY PRIMARY KEY,
    something VARCHAR (1024),
    xmlData XML
    );

    INSERT INTO dbo.TestXML05
    SELECT something = 'first', xmlData = '<Data><e1>404</e1><e2>Dont Care</e2><e3>Yomama</e3></Data>'  UNION ALL
    SELECT something = 'second', xmlData = '<Data><e1>500</e1><e2>Whoop Dee Do</e2><e3>Yikes</e3></Data>' UNION ALL
    SELECT something = 'third', xmlData = '<Data><e1>200</e1><e2>Nice</e2></Data>'         UNION ALL
    SELECT something = 'fourth', xmlData = '<Data><e1>404</e1><e2>Ho hum</e2><e3>Yopapa</e3></Data>'   UNION ALL
    SELECT something = 'fifth', xmlData = '<Data><e1>200</e1><e2>Nice</e2><e3>Fantastic</e3></Data>'   UNION ALL
    SELECT something = 'sixth', xmlData = '<Data><e2>Whoop Dee Do</e2><e3>Whoa!</e3></Data>';

    -- SEARCH PREDICATES
    DECLARE @ELEMENT_VALUE1 VARCHAR(50) = '404';
    DECLARE @ELEMENT_VALUE2 VARCHAR(50) = 'Yopapa';
    DECLARE @ELEMENT_VALUE3 VARCHAR(50) = NULL;

    -- SEARCH QUERY
    ;WITH PARAM_COUNT(PCNT) AS
    (
      SELECT ISNULL(SIGN(LEN(@ELEMENT_VALUE1)),0)
           + ISNULL(SIGN(LEN(@ELEMENT_VALUE2)),0)
           + ISNULL(SIGN(LEN(@ELEMENT_VALUE3)),0)
    )
    ,BASE_DATA AS
    (
      SELECT
       T5.id
       ,T5.something
       ,TX.DATA.value('(./text())[1]','varchar(50)')  AS ELEMENT_VALUE
      FROM  dbo.TestXML05 T5
      OUTER APPLY T5.xmlData.nodes('/Data//*') TX(DATA)
    )
    SELECT
      BD.id
     ,BD.something
    FROM  BASE_DATA BD
    CROSS APPLY PARAM_COUNT PC
    WHERE
      (
       BD.ELEMENT_VALUE  = @ELEMENT_VALUE1
      )
    OR
      (
       BD.ELEMENT_VALUE  = @ELEMENT_VALUE2
      )
    OR
      (
       BD.ELEMENT_VALUE  = @ELEMENT_VALUE3
      )
    GROUP BY
      BD.id
     ,BD.something
    HAVING COUNT(*) >= MAX(PC.PCNT);
    ;

  • Pagan DBA - Saturday, October 13, 2018 5:45 AM

    Some progress! I feel I'm closer to the solution. Not using XMLTable since it's just more clutter than I need. The 'NodeRef' column in the results I have provided just for reference so I can see how the match is taking place.

    DECLARE @match TABLE (
        name VARCHAR (64),
        val VARCHAR (64)
    )
    INSERT @match SELECT name = 'e1', val = '404'
    INSERT @match SELECT name = 'e3', val = 'Yopapa'

    --INSERT @match SELECT name = 'e2', val = 'Nice'
    --INSERT @match SELECT name = 'e1', val = '200'
    --INSERT @match SELECT name = 'e3', val = 'Fantastic'

    SELECT * FROM @match

    SELECT M.*, c.value('local-name(.)','NVARCHAR(64)') AS NodeRef FROM dbo.Test M
    CROSS APPLY xmlData.nodes('/Data/*') T(c)
    JOIN @match p ON p.name = c.value('local-name(.)','NVARCHAR(64)') AND p.val = c.value('text()[1]','NVARCHAR(64)')

    20    first     <Data><e1>404</e1><e2>Dont Care</e2><e3>Yomama</e3></Data>    e1
    23    fourth    <Data><e1>404</e1><e2>Ho hum</e2><e3>Yopapa</e3></Data>       e1
    23    fourth    <Data><e1>404</e1><e2>Ho hum</e2><e3>Yopapa</e3></Data>       e3

    Still need to figure out how to eliminate duplicates. Would really appreciate if someone can beat me to it. You can tell from the hour of the day this has kept me up all night :sick:

    I may be wrong, but once you have that output would it not be a case of using group by to eliminate the dups?
    With more rules for sure - on your current output I assume, maybe incorrectly, that lines 2 and 3 are dups - if so which one should be considered? the E1 or the E3?

  • Yes....I think you mean what I have below...and I think it works...

    SELECT M.id, M.something--, M.xmlData
    FROM dbo.Test M
    CROSS APPLY xmlData.nodes('/Data/*') T(c)
    JOIN @match p ON p.name = c.value('local-name(.)','NVARCHAR(64)') AND p.val = c.value('text()[1]','NVARCHAR(64)')
    GROUP BY M.id, M.something--, M.xmlData
    HAVING COUNT(*) = (SELECT COUNT(*) FROM @match)

    ...EXCEPT...the issue is I have to include all columns in the GROUP BY clause that I also need to return from my query. And when I uncomment M.xmlData from both places in above query, it gives me following error:

    Msg 305, Level 16, State 1, Line 169
    The XML data type cannot be compared or sorted, except when using the IS NULL operator
    .

    The whole point was to return the enter record including the xmlData column.  I'm worried if I grab just the Primary Key Id first and THEN again do a JOIN against entire table it will be terribly inefficient. Hoping there is an alternative to using GROUP BY. For now, below is "final" answer

    SELECT X.* FROM dbo.Test X
    JOIN (
        SELECT M.id
        FROM dbo.Test M
        CROSS APPLY xmlData.nodes('/Data/*') T(c)
        JOIN @match p ON p.name = c.value('local-name(.)','NVARCHAR(64)') AND p.val = c.value('text()[1]','NVARCHAR(64)')
        GROUP BY M.id
        HAVING COUNT(*) = (SELECT COUNT(*) FROM @match)
    ) Y ON X.id = Y.id

  • Pagan DBA - Saturday, October 13, 2018 6:37 AM

    Yes....I think you mean what I have below...and I think it works...

    SELECT M.id, M.something--, M.xmlData
    FROM dbo.Test M
    CROSS APPLY xmlData.nodes('/Data/*') T(c)
    JOIN @match p ON p.name = c.value('local-name(.)','NVARCHAR(64)') AND p.val = c.value('text()[1]','NVARCHAR(64)')
    GROUP BY M.id, M.something--, M.xmlData
    HAVING COUNT(*) = (SELECT COUNT(*) FROM @match)

    ...EXCEPT...the issue is I have to include all columns in the GROUP BY clause that I also need to return from my query. And when I uncomment M.xmlData from both places in above query, it gives me following error:

    Msg 305, Level 16, State 1, Line 169
    The XML data type cannot be compared or sorted, except when using the IS NULL operator
    .

    The whole point was to return the enter record including the xmlData column. Do I need to do another join somehow? And if so, how? I'm worried if I grab just the Primary Key Id first and THEN again do a JOIN against entire table it will be terribly inefficient. Hoping there is an alternative to using GROUP BY.

    Maybe something like the following - and depending on how many columns, their data and your requirements it may not be required to have all columns on the partition clause

    select *
    from (select t2.id
        , t2.something
        , t2.xmlData
        , t2.noderef
        , row_number() over (partition by t2.id
         , t2.something
         , convert(varchar(max), t2.xmlData)
         order by t2.noderef
         ) as rownum
       from (select m.id
          , m.something
          , m.xmlData
          , c.value ('local-name(.)', 'NVARCHAR(64)') as noderef
          from #test m
          cross apply xmlData.nodes('/Data/*') t (c)
          join @Match p
            on p.name = c.value ('local-name(.)', 'NVARCHAR(64)')
            and p.val = c.value ('text()[1]', 'NVARCHAR(64)')
        ) t2
    ) t3
    where t3.rownum = 1

  • @frederico_fonseca. Sorry, I didn't see your reply and edited my post. IMO, my solution is a little more readable. Unless you are saying your's is more efficient?

  • Can you please, in an unambiguous way, state the full requirements?
    😎 

    I've already posted a query that matches and exceeds the already stated requirements, are you not reading the posts?

  • Eirikur Eiriksson - Saturday, October 13, 2018 7:33 AM

    Can you please, in an unambiguous way, state the full requirements?
    😎 

    I've already posted a query that matches and exceeds the already stated requirements, are you not reading the posts?

    Err...Of course I am. It is because of all the replies I have been able to solve my problem. Please see my post which is the last on Page 1. I took all of you all's suggestions and then came up with my query.

    Now, in any event my original question was not clear....
    The situation I have is I need to track some "common" information and some "custom" information in a table. The former I called "something" - it can be one or more columns which don't change. The latter can be one or more data elements which I have called e1,e2,e3, etc...I dunno how many I will have. I also don't know how many of them will be needed on each record. I don't want to create a sparse table which I have to keep adding columns too. I also don't want to have table which stores Name-Value pairs (I've done this before and it gets tacky). So I decided to use XML to store the information.

    So what I needed was a way for clients to query records based on values based for any of e1, e2, e3, etc. which will not be known at run time. Like I said, I think with everyone's help I think I have a solution now. You will see my SQL has fewer lines that alternatives presented. So my last question was are the more verbose suggestions more efficient, in which case I will go with one of them. If not, I will stick with solution I came up with.

    Thank your for your time and help.

  • Pagan DBA - Saturday, October 13, 2018 9:57 AM

    Eirikur Eiriksson - Saturday, October 13, 2018 7:33 AM

    Can you please, in an unambiguous way, state the full requirements?
    😎 

    I've already posted a query that matches and exceeds the already stated requirements, are you not reading the posts?

    Err...Of course I am. It is because of all the replies I have been able to solve my problem. Please see my post which is the last on Page 1. I took all of you all's suggestions and then came up with my query.

    Now, in any event my original question was not clear....
    The situation I have is I need to track some "common" information and some "custom" information in a table. The former I called "something" - it can be one or more columns which don't change. The latter can be one or more data elements which I have called e1,e2,e3, etc...I dunno how many I will have. I also don't know how many of them will be needed on each record. I don't want to create a sparse table which I have to keep adding columns too. I also don't want to have table which stores Name-Value pairs (I've done this before and it gets tacky). So I decided to use XML to store the information.

    So what I needed was a way for clients to query records based on values based for any of e1, e2, e3, etc. which will not be known at run time. Like I said, I think with everyone's help I think I have a solution now. You will see my SQL has fewer lines that alternatives presented. So my last question was are the more verbose suggestions more efficient, in which case I will go with one of them. If not, I will stick with solution I came up with.

    Thank your for your time and help.

    Advice you to check the execution plan, the shorter code performs much worse than the code I posted 😉
    😎

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply