Optimizing inserts and deletes on a single large table

  • Our company has a very large table (150 million rows) to which we add 1000 rows per minute via a trigger on another table. At the same time, we are deleting older rows through a job which runs every few minutes. I am wondering if anyone has any recommendations for ensuring that the delete process and the insert trigger do not lead to much contention. The rows that we delete are at the other end of the table, so to speak, from the rows that we are inserting, but occasionally we experience some sluggishness in the whole process.

    We are using SQL Server 2005.

  • You would probably want to partition this table across multiple volumes, additionally ensuring that none of those volumes share any physical disks. You should probably use a partitioning function that puts the newly inserted records on different volume(s) than the old, to-be-deleted records (a rough sketch follows after this post).

    -- RBarryYoung, (302)375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
    Proactive Performance Solutions, Inc.
    "Performance is our middle name."
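
    For illustration only, a minimal sketch of the kind of range partitioning described above, with new rows landing on one set of filegroups/volumes and old rows on another. All object, column, and filegroup names here (dbo.BigTable, CreatedDate, FG_OLD, FG_NEW) are placeholders, and native table partitioning requires Enterprise Edition, as noted later in the thread:

    -- Range-partition by an ever-increasing date column so that new rows land in
    -- one filegroup and old, soon-to-be-deleted rows sit in another.
    -- Boundary values and all names are placeholders.
    CREATE PARTITION FUNCTION pfByDate (datetime)
    AS RANGE RIGHT FOR VALUES ('20090701', '20091001', '20100101');

    CREATE PARTITION SCHEME psByDate
    AS PARTITION pfByDate
    TO (FG_OLD, FG_OLD, FG_NEW, FG_NEW);   -- four partitions across two filegroups

    -- Rebuild the existing clustered index onto the partition scheme so the
    -- table's data is spread across the filegroups (and hence the volumes).
    CREATE CLUSTERED INDEX IX_BigTable_CreatedDate
    ON dbo.BigTable (CreatedDate)
    WITH (DROP_EXISTING = ON)
    ON psByDate (CreatedDate);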

  • Partitioning is a good idea, but in addition, is it worth trying to put hints like WITH (ROWLOCK) on the transactions that delete and insert, or is that not going to make much difference?

  • I doubt that it would make much difference, though you could always test it to see for your specific case (the hint syntax is sketched after this post).

    -- RBarryYoung, (302)375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
    Proactive Performance Solutions, Inc.
    "Performance is our middle name."
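
    If you do decide to test it, the hint is just a table hint on the DML statement. A minimal, hypothetical sketch (table and column names are placeholders):

    -- ROWLOCK asks the engine to prefer row locks; it is a request, not a
    -- guarantee, and lock escalation can still occur.
    DELETE FROM dbo.BigTable WITH (ROWLOCK)
    WHERE CreatedDate < '20090101';

    INSERT INTO dbo.BigTable WITH (ROWLOCK) (CreatedDate, Payload)
    VALUES (GETDATE(), 'example row');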

  • I forgot to mention that we are using Standard Edition and not Enterprise, so I guess partitioning of the table is out, right? Or rather, it would have to be simulated with a partitioned view?

  • Yes - partitioning is an Enterprise-only feature. You could indeed consider using a partitioned view (a rough sketch follows after this post), but you should probably take a step back before making such a big change. One very sound approach is to identify bottlenecks in the existing system, and then make changes to improve matters. The key is to work from evidence rather than simply applying whatever 'best practice' comes to hand.

    The subject is a complex one, and there is nowhere near enough detail in your posts to make any recommendations. For example, if your system is constrained by logging or free buffer pool pages, rather than data file I/O, partitioning tables and placing them on separate physical disks is highly unlikely to make much of an impact.

    The best papers available, to my knowledge, discussing how to approach this sort of project are linked to below - both feature senior figures from Microsoft. The basic strategy is to measure the 'waits and queues' over the system as a whole, and then apply targeted measures to improve those things most in need.

    Troubleshooting Performance Problems in SQL Server 2008

    SQL Server 2005 Waits and Queues
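
    For reference, a hedged sketch of the local partitioned view idea mentioned above, as it might look on Standard Edition. All names are placeholders; for the view to be updatable, each member table needs a non-overlapping CHECK constraint on the partitioning column, and that column must be part of the primary key:

    CREATE TABLE dbo.BigTable_2009
    (
        ID          int          NOT NULL,
        CreatedDate datetime     NOT NULL
            CHECK (CreatedDate >= '20090101' AND CreatedDate < '20100101'),
        Payload     varchar(100) NULL,
        CONSTRAINT PK_BigTable_2009 PRIMARY KEY (ID, CreatedDate)
    );

    CREATE TABLE dbo.BigTable_2010
    (
        ID          int          NOT NULL,
        CreatedDate datetime     NOT NULL
            CHECK (CreatedDate >= '20100101' AND CreatedDate < '20110101'),
        Payload     varchar(100) NULL,
        CONSTRAINT PK_BigTable_2010 PRIMARY KEY (ID, CreatedDate)
    );
    GO

    -- The view unions the member tables; inserts and deletes against the view
    -- are routed to the correct member table via the CHECK constraints.
    CREATE VIEW dbo.BigTableView
    AS
    SELECT ID, CreatedDate, Payload FROM dbo.BigTable_2009
    UNION ALL
    SELECT ID, CreatedDate, Payload FROM dbo.BigTable_2010;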

    What you need to do is find out WHY things appear to be sluggish at times. Is it due to blocking, to I/O stalls, or to some other form of wait? Each of those possibilities should be analyzed while the activity is in progress (a couple of starter queries are sketched after this post). If for some reason (such as query complexity leading to bad estimates, or improper indexing) table locks, or perhaps even page locks, are being acquired, then you could indeed be seeing blocking, and in that case row lock hints could help. If the queries that insert/delete are suboptimal, then whatever locks are taken are being held too long, and tuning could be the magic bullet you seek.

    LOTS of possibilities here. I recommend getting a tuning consultant in for a few hours remote work. Should be all it takes, and you can learn from how they do their work as well so you can better fix your own problems in the future.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
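
    As a starting point for the "why is it sluggish" question, two illustrative queries against the SQL Server 2005 DMVs (the TOP and filtering are just a sketch):

    -- Top accumulated waits since the last restart.
    SELECT TOP (10)
           wait_type,
           wait_time_ms,
           signal_wait_time_ms,
           waiting_tasks_count
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;

    -- Requests that are currently blocked, and what they are waiting on.
    SELECT session_id, blocking_session_id, wait_type, wait_time, command
    FROM sys.dm_exec_requests
    WHERE blocking_session_id <> 0;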

  • Any idea how many rows are normally deleted by your delete process?

    If it's more than about 5000, then you're likely to be getting table locks due to lock escalation, in which case it may be worth considering amending your delete to remove the records in batches of, say, 4000, which should at least prevent this issue.
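
    A hedged sketch of such a batched delete, staying below the escalation threshold discussed here; the table, column, cutoff, and batch size are placeholders:

    DECLARE @BatchSize int, @RowsDeleted int;
    SET @BatchSize   = 4000;
    SET @RowsDeleted = 1;

    -- Keep deleting in small batches until nothing qualifies.
    WHILE @RowsDeleted > 0
    BEGIN
        DELETE TOP (@BatchSize)
        FROM dbo.BigTable
        WHERE CreatedDate < DATEADD(day, -90, GETDATE());

        SET @RowsDeleted = @@ROWCOUNT;
    END;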

  • Andrew Gothard-467944 (12/14/2009)


    Any idea how many rows are normally deleted by your delete process?

    If it's more than about 5000, then you're likely to be getting table locks due to lock escalation, in which case it may be worth considering amending your delete to remove the records in batches of, say, 4000, which should at least prevent this issue.

    Can I ask where the 5000 number you picked came from?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Good question. I'll have to have a dig there.

    It was from an MS article on lock escalation, to the effect that the threshold for lock escalation was c. 3000 rows for SQL Server 2000 and c. 5000 for 2005 - but you're quite right, it's pretty bad form to post without reference to the appropriate source. I will try to rectify that.

    cheers

    Andrew

  • Paul White (12/14/2009)


    http://blogs.msdn.com/sqlserverstorageengine/archive/2006/05/17/Lock-escalation.aspx (5000 held locks, approximately)

    Assuming good cardinality, there could be 5000 or so PAGES locked if the engine decided that page locks were appropriate (which is likely for any reasonable row estimates that don't automatically trigger a table lock). That is many more rows than the 5000 stated as the break point. Good number, just a different thing being locked. 🙂

    But again we are back to the query that does the work, the estimates the optimizer works with, and what it finally decides for the actual INSERT and especially DELETE queries. OP?? Any follow-up here??

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
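
    On the page-versus-row lock question above, one way to see what granularity the engine actually chose is to query sys.dm_tran_locks from a second connection while the DELETE is running (sketch only):

    SELECT resource_type,        -- OBJECT, PAGE, KEY (row), etc.
           request_mode,
           COUNT(*) AS lock_count
    FROM sys.dm_tran_locks
    WHERE resource_database_id = DB_ID()
    GROUP BY resource_type, request_mode
    ORDER BY lock_count DESC;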

  • TheSQLGuru (12/14/2009)


    Assuming good cardinality, there could be 5000 or so PAGES locked if the engine decided that page locks were appropriate (which is likely for any reasonable row estimates that don't automatically trigger a table lock). That is many more rows than the 5000 stated as the break point. Good number, just a different thing being locked. 🙂

    Oh yes absolutely. I just posted the link to illustrate where the 5000 number came from.

    TheSQLGuru (12/14/2009)


    But again we are back to the query that does the work, the estimates the optimizer works with, and what it finally decides for the actual INSERT and especially DELETE queries. OP?? Any follow-up here??

    With any luck he's reading about waits and queues or hiring a consultant 😉

  • I have read the paper SQL Server 2005 Waits and Queues, which I guess is a locus classicus of the waits and queues methodology. The delete statement, BTW, is a DELETE FROM table WHERE ID IN (SELECT TOP 3000 ID FROM table). In themselves, the deletes are very fast. The inserts are ordinary row-level inserts (fired by a trigger), but 1000 per minute are arriving continuously. Finally, I should mention that we had to add memory to our system because SQL Server wanted to use more than we had, so some significant sluggishness was due to the server paging to disk.

  • billmcknigh (12/14/2009)


    I have read the paper SQL Server 2005 Waits and Queues, which I guess is a locus classicus of the waits and queues methodology. The delete statement, BTW, is a DELETE FROM table WHERE ID IN (SELECT TOP 3000 ID FROM table). In themselves, the deletes are very fast. The inserts are ordinary row-level inserts (fired by a trigger), but 1000 per minute are arriving continuously. Finally, I should mention that we had to add memory to our system because SQL Server wanted to use more than we had, so some significant sluggishness was due to the server paging to disk.

    Hmm, that DELETE seems curious. You have no ORDER BY for the TOP, so you could get ANY 3000 rows, right? And regardless of that (which is certainly not what I would want to be doing in almost all cases I can construct), why not use this syntax:

    DELETE TOP (3000) FROM table

    But I would really like to know why you don't care what 3000 rows get deleted.
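
    If the intent is to remove the oldest rows, one hedged option (assuming ID increases with row age, and that dbo.BigTable stands in for the real table) is to order the TOP inside a CTE and delete through it:

    ;WITH OldestRows AS
    (
        SELECT TOP (3000) ID
        FROM dbo.BigTable
        ORDER BY ID          -- oldest rows first, assuming an ascending key
    )
    DELETE FROM OldestRows;  -- deletes the underlying base-table rows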

    As for the INSERTs, are you saying that 1000 inserts are done individually to another table, and that causes a trigger to fire which inserts each of those rows into the table that has the DELETEs?

    Do you get blocking much?

    Have you checked for what locks are being held for the table?

    Have you run PerfMon to check lock counts/durations?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
