Indexing... Considerations for high (but predictable) INS/DEL activity.

  • The scenario...

I've got a "queue"-type table with more than 10 million records. By nature, it gets about 50K rows INSERTED daily, rarely as many as 300K. Also daily, about 50K rows get DELETED.

    Another table references this first "queue" table via FK (no cascade), and those records have similar INS/DEL activity.

Each table has an integer clustered PK and 4 or 5 non-clustered, non-unique indexes.

The performance of these major INS/DEL operations is quite poor, presumably because of the index upkeep required. The current maintenance job DELETEs records from both tables inside a loop, in batches of 1,000 records at a time to "reduce contention for the table". (I'm more of a "rip the band-aid off" kind of guy myself.) We're seeing only about 89 records/sec DELETEd using the current method; in a dev environment (with no production READ load) I saw 2,100 records/sec DELETEd with the indexes dropped.

    A senior developer here has indicated that we should consider tweaking the fill factor of the indexes in order to maintain high index performance for production operations AND get those major once-a-day INSERTS and DELETES to speed up.

    Questions...

Is the 1,000-rows-at-a-time batch idea worth anything?

    Is he right? Should I be looking at tuning the fill factor in this situation? Higher or lower than "stock"?

    Is there a better way altogether?

    Standing by... what other info can I offer?
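
For reference, the inherited pattern is roughly this (a simplified sketch; the table/column names and cutoff are placeholders, not our real ones):

-- sketch of the current batch-of-1000 DELETE loop (placeholder names)
DECLARE @rows int
SET @rows = 1

WHILE @rows > 0
BEGIN
    DELETE TOP (1000) q
    FROM dbo.queue_table q
    WHERE q.expire_date < GETDATE()

    SET @rows = @@ROWCOUNT
END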

  • Is the clustered index based on some sort of chronological entity that is also a factor of your deletes?

For a scenario like the one you just described... rows inserted and rows aged off every day, I want the clustered index on a date column of some sort. My assumption is that you delete these based on the age of the row, which is the most logical assumption. If that is not the case, let us know... but even if not, you can usually factor the age of the row (based on an ID or a datetime) into the age-off formula, which allows you to efficiently and concurrently delete these rows.

Batching the deletes is fine, but it is especially necessary if the rows to be deleted are interspersed throughout the table (i.e., not based on some chronological item that also defines the table organization).
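
To illustrate the idea (names here are made up): if the table were clustered on the age-off date, the rows to remove form one contiguous range at the head of the clustered index, so even a single un-batched delete only touches that range:

-- illustrative only: assumes the clustered index is on created_date
DELETE q
FROM dbo.queue_table q
WHERE q.created_date < DATEADD(day, -30, GETDATE())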

    The probability of survival is inversely proportional to the angle of arrival.

  • The explicit DELETE statements...

delete v
from license_message_value v
inner join license_message lm
    on v.license_message_id = lm.license_message_id
where lm.archive_date < getdate()

delete from license_message
where archive_date < getdate()

    ... so that [archive_date] field is the crux of the biscuit. The [~_id] fields are merely integers, and aren't necessarily in order by the time the archive day rolls around. [archive_date] does have an IX, but I'd have to move mountains to get it into the clustered PK.

I could use the cluster in a roundabout way by keeping the batch setup in there, and instead of DELETEing the TOP 100 based on [archive_date] DESC, use the clustered PK DESC, like so...

;WITH batch AS
(
    SELECT TOP (@batch) *
    FROM license_message
    WHERE archive_date < getdate()
    ORDER BY license_message_id DESC
)
DELETE FROM batch

So, the WHERE asserts we're getting the proper rows, and the ORDER BY that fills the @batch on each iteration through the loop would be on the cluster.

    I might give it a shot.

Okay, I understand your setup now. The clustered index is on the id.

    I'd try to do it this way:

--
-- get a consistent snapshot of ids to be deleted
-- with minimal footprint and best concurrency
--
select license_message_id
into #deleteSnapshot
from license_message with (NOLOCK)
where archive_date < getdate()

--
-- delete the details rows:
-- [this assumes you still want to batch it out].
--
set @cnt = 1
while @cnt > 0
begin
    delete top (1000) v
    from license_message_value v
    join #deleteSnapshot snap
        on v.license_message_id = snap.license_message_id

    set @cnt = @@ROWCOUNT
end

--
-- Delete the parent rows
--
delete m
from license_message m
join #deleteSnapshot snap
    on m.license_message_id = snap.license_message_id

    You might even try putting an index on the temp table before you start using it:

    create clustered index IX_snapshot_id on #deleteSnapshot (license_message_id)

    The probability of survival is inversely proportional to the angle of arrival.

If I understand correctly, the license_message table has:

    clus index (/PK) on license_message_id

    nonclus index on archive_date

    <other nonclus indexes>

    Yes, the clustering key for this table should be archive_date. Everything you do would likely run faster, often much faster, with the correct clus key. [Technically you could change the existing PK to be non-clus, then change the non-clus index on archive_date to be clus -- that is, change the clus column w/o changing the column that is the "official PK".]

    [Again, too many people insist on defaulting the clus key to an identity.]
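
If you did go that route, the mechanics would be something like this (a sketch; the constraint and index names are made up, and the FK from license_message_value would have to be dropped and recreated around the PK change):

-- make the PK nonclustered, then cluster on archive_date (names are assumptions)
ALTER TABLE dbo.license_message DROP CONSTRAINT PK_license_message;

ALTER TABLE dbo.license_message ADD CONSTRAINT PK_license_message
    PRIMARY KEY NONCLUSTERED (license_message_id);

DROP INDEX IX_license_message_archive_date ON dbo.license_message;

CREATE CLUSTERED INDEX IX_license_message_archive_date
    ON dbo.license_message (archive_date);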

    At any rate, if you must keep the original structure, you can still improve the DELETEs. No reason to do all those joins from the value table back to its parent.

    Naturally you can adjust the code below to work in batches, but I would do more than 1000 at a time; try 2,500.

Btw, make sure the log file on the db has existing unused space to log the DELETEs, and a decent autogrow amount (a fixed amount, not a percentage). Part, perhaps most, of the reason for the lengthy DELETEs could be log growth.

DECLARE @deleted_ids TABLE (
    license_message_id int
)

BEGIN TRY

    BEGIN TRANSACTION

    DELETE FROM dbo.license_message
    OUTPUT DELETED.license_message_id INTO @deleted_ids
    WHERE archive_date < getdate()

    DELETE FROM v
    FROM dbo.license_message_value v
    INNER JOIN @deleted_ids di ON
        v.license_message_id = di.license_message_id

    COMMIT TRANSACTION

END TRY
BEGIN CATCH

    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION

END CATCH
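
On the log sizing point above, pre-sizing the log and giving it a fixed autogrow would be something along these lines (a sketch; the database name and logical log file name are assumptions):

-- pre-size the log and use a fixed autogrow increment (names are assumptions)
ALTER DATABASE MyDb
    MODIFY FILE (NAME = MyDb_log, SIZE = 8GB, FILEGROWTH = 512MB);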

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Yeah, that temp table approach is paying dividends.

create table #batchlist (
    lm_id int not null primary key clustered
)

    (It's faster with the explicit cluster, yes.) Haven't tried the table variable approach yet, but that OUTPUT statement is SLICK.

Looks like the @batch sweet spot is going to come in between 500 and 4,000, so 2,500 was a good guess. I'm already seeing a 4:1 improvement over the code I inherited just by using the temp table approach, so thanks again for making me look like a champ, fellas.
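
For the record, one pass of what I'm running now looks roughly like this (a sketch; @batch is the batch-size variable, and the loop plumbing is omitted):

-- fill the batch list with the next set of expired parent ids
INSERT INTO #batchlist (lm_id)
SELECT TOP (@batch) license_message_id
FROM license_message
WHERE archive_date < getdate()

-- child rows first (the FK points child --> parent)
DELETE v
FROM license_message_value v
JOIN #batchlist b
    ON v.license_message_id = b.lm_id

-- then the parent rows
DELETE lm
FROM license_message lm
JOIN #batchlist b
    ON lm.license_message_id = b.lm_id

TRUNCATE TABLE #batchlist   -- reset for the next pass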

  • Greg J (6/6/2012)


    Another table references this first "queue" table via FK (no cascade), and those records have similar INS/DEL activity.

    I'm curious to know why there isn't an ON DELETE CASCADE? You could then just DELETE TOP (x) FROM Parent WHERE archive_date < GETDATE(). I'm not saying it will necessarily be better (or that there aren't other things to consider here), just wondering about the reason.
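
For reference, that would just mean defining the constraint with the cascade option, something like (a sketch; the constraint name is made up):

ALTER TABLE dbo.license_message_value
    ADD CONSTRAINT FK_license_message_value_license_message
    FOREIGN KEY (license_message_id)
    REFERENCES dbo.license_message (license_message_id)
    ON DELETE CASCADE;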

  • ScottPletcher (6/6/2012)


DECLARE @deleted_ids TABLE (
    license_message_id int
)

BEGIN TRY

    BEGIN TRANSACTION

    DELETE FROM dbo.license_message
    OUTPUT DELETED.license_message_id INTO @deleted_ids
    WHERE archive_date < getdate()

    DELETE FROM v
    FROM dbo.license_message_value v
    INNER JOIN @deleted_ids di ON
        v.license_message_id = di.license_message_id

    COMMIT TRANSACTION

END TRY
BEGIN CATCH

    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION

END CATCH

    That wouldn't work if there were FK from the license_message_value table id --> license_message table.

    Of course if it was a cascade delete that would be different.

    The probability of survival is inversely proportional to the angle of arrival.

Kiwi, coming into this problem, ON DELETE CASCADE was my first inclination, because I assumed the SQL engine streamlined the second table's delete somehow. But after some research, I found that the clause doesn't really yield any performance gains. The consensus I gathered on CASCADEd deletes ranged from "eh, it's benign" to "avoid if possible"; there seemed to be no outright support for any reason that I cared about.

Also, the code base I'm working on gets a lot of developer eyes on it in the span of a year, so I'd rather be explicit with the second DELETE in case someone more shortsighted than myself inherits this code.

    sturner, you're right, and the order of my DELETEs is flipped.

  • That wouldn't work if there were FK from the license_message_value table id --> license_message table.

    I reversed the order of the table deletes for performance. By doing the controlling (in this case parent) table first, the deletes from the child table could be done without having to join back to the parent (to determine the date).

    That OUTPUT statement is SLICK.

    Yep, it's one of the really sweet things added to SQL :-).

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (6/7/2012)


    That wouldn't work if there were FK from the license_message_value table id --> license_message table.

    I reversed the order of the table deletes for performance. By doing the controlling (in this case parent) table first, the deletes from the child table could be done without having to join back to the parent (to determine the date).

    I think the point sturner was making was that your order of deletion would violate the foreign key constraint. If reversed (deleting child records first, and storing ids to delete from the parent) it seems problematic to ensure *all* child records that relate to a particular parent are deleted, especially when batching is desired.

    It's all a bit confusing though; the DDL earlier seemed to show archive_date present on both tables, for example.

  • Should I be looking at tuning the fill factor in this situation? Higher or lower than "stock"?

    Yes, it can be very applicable to non-clus indexes.

    You would lower the fillfactor, i.e., increase the freespace.

The trick is figuring out the "best" fillfactor for each non-clus index -- getting a good initial value takes knowledge and skill, and so does refining it from there to the very "best" value.
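
Setting it is the easy part, e.g. (a sketch; the index name and the value of 80 are placeholders, not a recommendation):

ALTER INDEX IX_license_message_archive_date
    ON dbo.license_message
    REBUILD WITH (FILLFACTOR = 80);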

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • I think the point sturner was making was that your order of deletion would violate the foreign key constraint.

    That's a good point; yes, I wasn't sure if indeed such a FK actually existed.

    One could disable/drop the FK constraint during the DELETE process, then enable/recreate it after, depending on what the cost of that was.
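
If you went that way, the disable/re-enable (with re-validation) would be something like this (a sketch; the constraint name is an assumption):

-- disable the FK for the duration of the purge (constraint name is an assumption)
ALTER TABLE dbo.license_message_value
    NOCHECK CONSTRAINT FK_license_message_value_license_message;

-- ... run the DELETEs ...

-- re-enable and re-validate the FK afterward
ALTER TABLE dbo.license_message_value
    WITH CHECK CHECK CONSTRAINT FK_license_message_value_license_message;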

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (6/8/2012)


    I think the point sturner was making was that your order of deletion would violate the foreign key constraint.

    That's a good point; yes, I wasn't sure if indeed such a FK actually existed.

    One could disable/drop the FK constraint during the DELETE process, then enable/recreate it after, depending on what the cost of that was.

    Might not be a good idea if inserts are expected while also doing the deletes.

  • Might not be a good idea if inserts are expected while also doing the deletes.

Perhaps, if you really do normally run INSERTs that would violate the constraint. Hopefully that's extremely rare, and the constraint is more a DELETE preventative than an INSERT check (and presumably unauthorized DELETEs also aren't allowed to occur, except by accident).

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.
