XML Noob Question

  • I'm working through the 70-461 training book, and have gotten to the chapter about XML. If I'm querying element centric XML would the following code be the most efficient in finding first and last name of people that live in Chicago? Secondly, do I always have to declare a namespace in order to search a column with multiple XML documents?

    WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume' AS ns)

    SELECT [Resume].value('(/ns:Resume/ns:Name/ns:Name.First)[1]', 'varchar(50)') AS Firstname,

    [Resume].value('(/ns:Resume/ns:Name/ns:Name.Last)[1]', 'varchar(50)') AS Lastname

    FROM HumanResources.JobCandidate

    WHERE [Resume].exist('/ns:Resume/ns:Employment/ns:Emp.Location/ns:Location/ns:Loc.City[.="Chicago"]')=1

    Any help would be appreciated... XML is kind of confusing, trying to wrap my mind around it.

  • rho_pooka (2/19/2014)


    I'm working through the 70-461 training book, and have gotten to the chapter about XML. If I'm querying element centric XML would the following code be the most efficient in finding first and last name of people that live in Chicago? Secondly, do I always have to declare a namespace in order to search a column with multiple XML documents?

    To break down your first question. Is this the most efficient? I'm not sure if this is more efficient. I do think that this is the easiest to write and understand. Someone else may be able to answer the efficiency question. There are several alternates to this query, but the are complicated.

    Is this a way to get the first and last name? This is a good way to get the first and last name.

    Your second question. The namespace has absolutely nothing to do with multiple documents (you are dealing with only one here). The namespace is a part of the identity of the elements and attributes of the document.

    There are several analogies that one can use for namespace. First if you are a .Net developer, the meaning of namespace here is very similar to the use of the term in .Net. For example, there are all kinds of Property classes out there. Many have similar structures, many are different. How do you distinguish them? You put identify each of the Property classes in a namespace. The Property class that is in one namespace is intended to be totally independent of a Property class in another namespace. They are not the same even though their Property class name is different.

    .Net namespaces might not work for you. How about SQL? How do you interpret an Amount column? What is the data type of the Amount column? If I had a select statement like the following, do you expect it to work?

    Select Amount;

    Amount is meaningless outside of the table that it belongs to (and the table structure is specific to a schema and database). Amount has meaning within the namespace of the table (metaphorically speaking).

    In the XML that is in the Resume column... That is meaningless by the way, you could put an integer in the Resume column or a varbinary(max) containing a PDF. However, in the JobCandidate table (in the HumanResources table, all metaphorically namespaces that define the Resume column), the Resume column is XML. All of the elements in the Resume column are defined within a specific namespace: http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume. There is no Resume element in this XML document. There is a Resume element that is in the "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume" namespace. None of the other elements exist in this XML document (as far as I remember) outside of the "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume" namespace.

    One point I missed. How do I know that all of the elements are in the "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume" namespace. If you view the Resume column, you will see something like this xmlns:ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume". That is interpreted as "Treat every element in this document that has the prefix "ns:" as belonging to the 'http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume' namespace." Whoever put together the query you posted also used the "ns:" prefix.

    By the way, the following two queries works:

    WITH XMLNAMESPACES

    ('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume' AS peach)

    SELECT

    [Resume].value('(/peach:Resume/peach:Name/peach:Name.First)[1]', 'varchar(50)') AS Firstname,

    [Resume].value('(/peach:Resume/peach:Name/peach:Name.Last)[1]', 'varchar(50)') AS Lastname

    FROM HumanResources.JobCandidate

    WHERE [Resume].exist

    ('/peach:Resume/peach:Employment/peach:Emp.Location/peach:Location/peach:Loc.City[.="Chicago"]')

    =1;

    WITH XMLNAMESPACES(default 'http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume' )

    SELECT [Resume].value('(/Resume/Name/Name.First)[1]', 'varchar(50)') AS Firstname,

    [Resume].value('(/Resume/Name/Name.Last)[1]', 'varchar(50)') AS Lastname

    FROM HumanResources.JobCandidate

    WHERE [Resume].exist('/Resume/Employment/Emp.Location/Location/Loc.City[.="Chicago"]')=1

    In the first, I am using "peach:" to identify the namespace. In the second, I am saying that if there is no prefix, then treat the element name as belonging to the namespace above.

    The following queries don't work:

    SELECT

    [Resume].value('(/peach:Resume/peach:Name/peach:Name.First)[1]', 'varchar(50)') AS Firstname,

    [Resume].value('(/peach:Resume/peach:Name/peach:Name.Last)[1]', 'varchar(50)') AS Lastname

    FROM HumanResources.JobCandidate

    WHERE [Resume].exist

    ('/peach:Resume/peach:Employment/peach:Emp.Location/peach:Location/peach:Loc.City[.="Chicago"]')

    =1;

    SELECT [Resume].value('(/Resume/Name/Name.First)[1]', 'varchar(50)') AS Firstname,

    [Resume].value('(/Resume/Name/Name.Last)[1]', 'varchar(50)') AS Lastname

    FROM HumanResources.JobCandidate

    WHERE [Resume].exist('/Resume/Employment/Emp.Location/Location/Loc.City[.="Chicago"]')=1

    The first doesn't work because it doesn't know how to interpret the "peach:" prefix. In the second, it treats the element names as belonging to some global namespace (or no namespace at all, I'm fuzzy on that point). The Resume element in the second query is not the Resume element in the "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume" that is for sure and thus it doesn't match any element that is in the XML document that is in the Resume column.

    I hope that this clears some of this up. My long words may have muddied it more, please submit follow up questions.

    Russel Loski, MCSE Business Intelligence, Data Platform

  • First off, thank you for such a detailed response! I'm in awe of your reply. That's a lot to digest and need to read through your reply multiple times, try the code, and make sense of it all. I'll try to reply when I understand this more clearly.

    Thanks again!

    Ben

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply