key lookup on select query with xml column

  • hey all, there is a table:

    create table dbo.mytable (

    colA int identity(1,1),

    colB varchar(50),

    colC bigint,

    colD xml)

    Primary key on (colA) and a covered non-clustered index for all columns except xml.

    select * from mtable and looking at execution plan show a key lookup for the xml column referencing back to primary clustered index.

    I created an xml index and then doing a select *, still shows key lookup. I would imagine that xml index are only used when doing xquery operations.

    How can you remove the key lookup step without removing the xml column from the query?

    Thanks

  • Pretty sure that this will happen regardless of what you do if you want to return X in the table without having an index that covers all X columns you want. So it's using the best available index you have, but since the one column isn't included in that same index it's doing a key lookup to retrieve the values. If it's a real simple table with no conditions I'm surprised it's not just doing a table scan but I'm guessing there is more to the story.

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • The screen didnt let me add the xml column to the include part of the nonclustered index.

    So I scripted it out then added it manually, this created the index.

    Is this a bug in SSMS 2008 gui?

    Then doing the select query removed the key lookup and the decreased num of pages read, which is good.

    So now the question is, whats the difference in terms of insert/update/delete operations on an xml column in the include part of nonclustered index VS. primary xml index?

    This table has 1 million+ rows with an average xml size of 1 mbyte based on datalength().

  • My understanding is that the index will be less efficient for data insertions/updates/deletions because the values of the included columns are stored at the leaf level, but a potential query on the data will be much faster because the values of the included columns are stored at the leaf level.

    In other words, whether or not it's worth it really depends on you environment and the trade-off between write performance and query performance.

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Looking at the results below for the include part of the index;

    --with xml

    table1255203 2627752 KB1294992 KB1332208 KB 552 KB

    --without xml

    table1255203 1333536 KB1294992 KB38048 KB 496 KB

    The index size has increased from 38 mbytes to 1.3 gbytes based on the inclusion of the xml column in the nonclustered index.

    A simple insert on this table took 16 sec vs 1 sec (without xml column).

    I'm going to stick without the xml column in nonclustered index.

  • msarikas (6/2/2010)


    I created an xml index and then doing a select *, still shows key lookup.

    If you mean you created a PRIMARY XML index, then yes, you still get a lookup because a PRIMARY XML index is just a shredded representation of the XML stored in a separate (internal) table. The PRIMARY XML index is the clustered index on that 'hidden' table.

    See Indexes on XML Data Type Columns

    I would imagine that xml index are only used when doing xquery operations.

    XML methods including XQuery, yes.

    How can you remove the key lookup step without removing the xml column from the query?

    You could add the XML column as an INCLUDEd column in a non-clustered index, but that would be relatively unusual, unless you have a specific need to return the whole XML structure typed as XML.

    If the XML data is large, you would probably never want to INCLUDE it in a non-clustered index.

  • msarikas (6/2/2010)


    The screen didnt let me add the xml column to the include part of the nonclustered index.

    So I scripted it out then added it manually, this created the index.

    Is this a bug in SSMS 2008 gui?

    Yes.

    Then doing the select query removed the key lookup and the decreased num of pages read, which is good.

    Only because the clustered index contains every column. I can't imagine the number of pages read decreased by very much at all.

    So now the question is, whats the difference in terms of insert/update/delete operations on an xml column in the include part of nonclustered index VS. primary xml index?

    Nothing special just because it is XML. Like any other INCLUDEd column (except in the special case where you try to INCLUDE a column that forms part of a non-unique cluster key) a copy of the data is stored at the lowest (leaf) level of the non-clustered index. Changes will typically be fully logged, so UPDATE operations that affect the XML column would write the data at least three times - (1) to the table itself (2) to the index (3) to the log (for re-do) and (4) the previous value will be written to the log in case the change needs to be undone.

    This table has 1 million+ rows with an average xml size of 1 mbyte based on datalength()

    Storing 1MB of XML per row sounds like a highly unusual design. Are you sure this is unavoidable?

    If you really absolutely must store that much data per row, you should at least consider FILESTREAM storage.

    The right choice depends on a whole load of information only you possess.

  • Sorry to piggy back on this old thread, but I had a question and couldn't seem to find any other information about it.

    Like the OP I have the Key Lookup on the XML column. SQL Kiwi mentioned that this happens because the Primary XML is on the table but no secondary XML. I have thesame thing, but I haven't tested or found anything that stated that adding secondary XML indexes would solve this.

    If I put a secondary xml for value on the table and I'm doing value queries - will that stop the key lookups?

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply