Guidance on table valued types.

  • We have 1 table valued parameter (tvp) coming to GetXXX stored procedures. It contains only 1 column of type uniqueidentifier. Currently, it does not have any index on it. For performance improvement, I was thinking of adding a primary key as clustered index on it. My initial testing showed if there less than 100 records in that tvp then it actually performs worse than the older one which doesn't have any index on it. I was hoping that it would perform same if not better for smaller datasets. Does anybody know if I'm missing anything here.

    Thanks in advance.

    Regards,

    Mayur

  • Hi,

    Did you check the query plan? I'm quessing the new index is not being used. Can you give an example on the query you are using on that table?

    Regards!

  • From the plan, it can be seen that the new index is used. It also saves an extra Sort operation on the tvp.

  • Hmm, may not be worth it, you know. There is only one column in the table. If you had more columns and if filtering was performed on the key column - yes. As you said, the data will be sorted because of the clustered index, but the sorting happens probably when the SP call is made. You may not benefit from this.

    https://sqlroadie.com/

  • I recommend having a clustered PK on all TVPs. I enforce this in code reviews too. 1) it keeps data transfer to a minimum by forcing sloppy development from passing more data than necessary and 2) having a clustered index on the TVP will help the engine (although not guarantee it can) maintain a consistent data access pattern which helps reduce deadlocks.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • mayur birari (2/28/2013)


    From the plan, it can be seen that the new index is used. It also saves an extra Sort operation on the tvp.

    HOW is the index ACTUALLY being used? Just because it may be an INDEX SEEK doesn't necessarily mean that it will be more effective. You have to look at the properties of the seek. How many times is the seek actually occuring? For example, 40,000 INDEX SEEKs behind the scenes can be quite a bit slower than a single scan at the leaf level.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/3/2013)<snip/> For example, 40,000 INDEX SEEKs behind the scenes can be quite a bit slower than a single scan at the leaf level.

    Jeff, in this case, won't optimizer choose to scan rather than to seek? Will optimizer seek blindly because the index is covering? Or does it depend?

    https://sqlroadie.com/

  • One other question came to mind: In what order are the uniqueidentifiers being added to the TVP as they are sent to the SQL Server? If you're using .NET then is it safe to assume the rows are in "any old order" in your DataTable or can you confirm they are sorted?

    And a bit more on this:

    My initial testing showed if there less than 100 records in that tvp then it actually performs worse than the older one which doesn't have any index on it. I was hoping that it would perform same if not better for smaller datasets.

    The point where the performance gain the proc has with the index in place outweighs the overhead the engine imposes having to sort the rows as they are added to the Table-type sounds like it is at ~100 rows. Like I said though, the hint you give the engine by having the clustered index in place in terms of which order to process the data when it comes to joins will help you avoid deadlocks in concurrency-scenarios.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Arjun Sivadasan (3/3/2013)


    Jeff Moden (3/3/2013)<snip/> For example, 40,000 INDEX SEEKs behind the scenes can be quite a bit slower than a single scan at the leaf level.

    Jeff, in this case, won't optimizer choose to scan rather than to seek? Will optimizer seek blindly because the index is covering? Or does it depend?

    It'll depend. The execution plan will compile according to assumptions based on statistics. If they're wrong... well, bad things happen.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Thanks Craig, such has been my experience as well.

    https://sqlroadie.com/

  • Thanks everybody for your responses. I could not post the table structure etc because of policies. Here is what I found out after more testing.

    From the execution plans, it looks like the SQL server has to use all the records from the tvp anyways as it does index seeks against the main physical table. That is why having an index on tvp does not matter.

    There is one thing to notice here and that is how do you join the tvp against the physical table:

    1. With the non-indexed tvp, when I use where exists() to filter records from the physical table, then the SQL Server does add Sort() in the execution plan after the Table scan of the tvp . If I convert it inner join then it removes that Sort().

    2. With the indexed tvp, it does need that Sort() even if where exists() is used.

    So when inner join is used, both non-indexed and indexed tvps perform same.

    So I think the only advantage that an index gives is the uniqueness of the records coming in. The "Ordered" part of the index does not come in handy. I tried this out with the inserts as well and found out the same thing.

    Let me know what do you guys thing.

    Thanks,

    Mayur

  • mayur birari (3/4/2013)


    Thanks everybody for your responses. I could not post the table structure etc because of policies. Here is what I found out after more testing.

    From the execution plans, it looks like the SQL server has to use all the records from the tvp anyways as it does index seeks against the main physical table. That is why having an index on tvp does not matter.

    There is one thing to notice here and that is how do you join the tvp against the physical table:

    1. With the non-indexed tvp, when I use where exists() to filter records from the physical table, then the SQL Server does add Sort() in the execution plan after the Table scan of the tvp . If I convert it inner join then it removes that Sort().

    2. With the indexed tvp, it does need that Sort() even if where exists() is used.

    So when inner join is used, both non-indexed and indexed tvps perform same.

    So I think the only advantage that an index gives is the uniqueness of the records coming in. The "Ordered" part of the index does not come in handy. I tried this out with the inserts as well and found out the same thing.

    Let me know what do you guys thing.

    Thanks,

    Mayur

    We said the unique constraint helps defend against duplicates from entering the mix due to sloppy development and gives the optimizer more information with which to base its decisions. In your specific case maybe uniqueness is all you will benefit from initially however an index comes as a package deal with a unique constraint like a primary key. In your specific case you may not see a benefit with or without the index in your join however if the plan ever changes due to changes in the data distribution, or possibly the introduction of a new index definition, then sorting requirements may also change on either side of the join and at that point the index on the TVP may come in handy in guiding the engine into a consistent data access pattern across concurrent requests. This is where you will help the engine avoid deadlocks.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thanks opc.

    I missed the point with the deadlock in-case of inserts. This is an important factor in case of high transaction environment.

    Thanks again.

  • Arjun Sivadasan (3/3/2013)


    Jeff Moden (3/3/2013)<snip/> For example, 40,000 INDEX SEEKs behind the scenes can be quite a bit slower than a single scan at the leaf level.

    Jeff, in this case, won't optimizer choose to scan rather than to seek? Will optimizer seek blindly because the index is covering? Or does it depend?

    Like Craig said, it depends but you do have to look.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi @opc,

    I did some testing with the indexed tvps interms of deadlocks. I did not find any difference between non-indexed and indexed versions. I created a simple application which spaws n number of threads and then each thread would insert m number of items into the tables. The surrogate PK in our database is Guid and we generate it in .Net with Guid.NewGuid(). The deadlocks that occur are index locks mainly because of the foreign checks.

    Regards,

    Mayur

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply