Which index will perform better

  • I have a table with about 12M records. Between the columns, there is a datetime field D and a bit field B. I need to delete records with B = 1. There is an assumption that records with B = 1 will always have D set to a future date.

    Which will work faster:
    1. Having an index on B and doing DELETE WHERE B = 1 
    2. Having and index on D and B and doing DELETE WHERE D > GETDATE() AND B = 1
    ?

  • Have you done any testing?  I'd be concerned about your assumption as well.  If that doesn't prove true or you delete runs and fails (i.e. blows out the log), then you will have records that should be deleted and aren't.  If it is true, then the date would be better in the index because of cardinality .  Also, I don't know how big or how many records you would be deleting or how volatile the table is, but I would think about putting it in a loop and deleting top (N) so you won't run the rick of blowing out the log or locking the table

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Eliyahu - Tuesday, January 15, 2019 4:13 AM

    I have a table with about 12M records. Between the columns, there is a datetime field D and a bit field B. I need to delete records with B = 1. There is an assumption that records with B = 1 will always have D set to a future date.

    Which will work faster:
    1. Having an index on B and doing DELETE WHERE B = 1 
    2. Having and index on D and B and doing DELETE WHERE D > GETDATE() AND B = 1
    ?

    My recommendation would be to look into filtered indexes.  I also wouldn't do a direct delete.  Instead, I'd do a search for the bit/date (whatever) and get a listing of all the PK values of rows that needed to be deleted and use that as my driver (as a join) for the deletes.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Agree, a filtered index if possible.

    Then lock down the requirements.  If you need to delete all B = 1, whether the date is a future date or not (i.e. you don't need to assume they all are), then base the DELETE on the B column alone.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Jeff Moden - Tuesday, January 15, 2019 7:27 AM

    ...I also wouldn't do a direct delete.  Instead, I'd do a search for the bit/date (whatever) and get a listing of all the PK values of rows that needed to be deleted and use that as my driver (as a join) for the deletes.

    I'm curious to learn more about why you advise this? Can you elaborate? Thanks!


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply