December 21, 2007 at 11:11 am
OK, here it goes... I have been working on this issue for several days and can't seem to limit or eliminate the blocking.
We have a DB with several "CallDetail" tables that are written to at a rate of 10 records a second at peak times. The DB has about 100,000,000 rows that we want to keep. We purge the tables nightly to get rid of old data, and this is when the trouble starts. We get blocking all over...
We insert data from 30 different applications using 30 different client connections.
When the delete runs, it blocks the inserts.
The inserts then write to a file to hold the data until it can be processed, preventing data loss but using a lot of system resources.
The delete was using (nolock), but I found it MUCH faster to lock the table briefly while it pulls a list of old data.
The table is configured as follows:
CREATE TABLE [dbo].[AttachdataDetail] (
[connid] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[key1] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SALESPROD_1] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SALESPROD_2] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SALESPROD_3] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SALESPROD_4] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[RET_EW_CODE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BT_ALT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[REPRICE_ALT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PARTNER_ALT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PRIVATE_LABEL] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OFFER_SERVICE_1] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OFFER_SERVICE_2] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[AUTH_PROHIB_STAT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BANKRUPT_STAT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CLOSED_STAT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[REVOKED_STAT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[FROZEN_STAT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[INT_ACCU_STAT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LOST_STAT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[STOLEN_STAT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CHARGEOFF_STAT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[DELINQUENT_STAT] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OVERLIMIT_STAT] [char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OD_STAT] [char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[VIP_ALT] [char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[RET_SCORE] [char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ANI_NEG] [char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MRK_INITIATIVE1] [char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MRK_INITIATIVE2] [char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[VERIFICATION_ST] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CALL_REASON] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[time_stamp] [datetime] NOT NULL
) ON [PRIMARY]
GO
CREATE INDEX [Key] ON [dbo].[AttachdataDetail]([connid], [key1]) ON [PRIMARY]
GO
The inserts use the following SP.
CREATE PROCEDURE dbo.sp_WriteAttachDetailRecord
(
@connid varchar(30)='no data',
@key1 varchar(20)='no data',
@SALESPROD_1 varchar(20)='no data',
@SALESPROD_2 varchar(20)='no data',
@SALESPROD_3 varchar(20)='no data',
@SALESPROD_4 varchar(20)='no data',
@RET_EW_CODE varchar(20)='no data',
@BT_ALT varchar(20)='no data',
@REPRICE_ALT varchar(20)='no data',
@PARTNER_ALT varchar(20)='no data',
@PRIVATE_LABEL varchar(20)='no data',
@OFFER_SERVICE_1 varchar(20)='no data',
@OFFER_SERVICE_2 varchar(20)='no data',
@AUTH_PROHIB_STAT varchar(20)='no data',
@BANKRUPT_STAT varchar(20)='no data',
@CLOSED_STAT varchar(20)='no data',
@REVOKED_STAT varchar(20)='no data',
@FROZEN_STAT varchar(20)='no data',
@INT_ACCU_STAT varchar(20)='no data',
@LOST_STAT varchar(20)='no data',
@STOLEN_STAT varchar(20)='no data',
@CHARGEOFF_STAT varchar(20)='no data',
@DELINQUENT_STAT varchar(20)='no data',
@OVERLIMIT_STAT varchar(20)='no data',
@OD_STAT varchar(20)='no data',
@VIP_ALT varchar(20)='no data',
@RET_SCORE varchar(20)='no data',
@ANI_NEG varchar(20)='no data',
@MRK_INITIATIVE1 varchar(20)='no data',
@MRK_INITIATIVE2 varchar(20)='no data',
@VERIFICATION_ST varchar(20)='no data',
@CALL_REASON varchar(20)='no data'
)
As
BEGIN
INSERT INTO dbo.AttachdataDetail with (rowlock) values
(
@connid,
@key1,
@SALESPROD_1,
@SALESPROD_2,
@SALESPROD_3,
@SALESPROD_4,
@RET_EW_CODE,
@BT_ALT ,
@REPRICE_ALT,
@PARTNER_ALT,
@PRIVATE_LABEL,
@OFFER_SERVICE_1,
@OFFER_SERVICE_2,
@AUTH_PROHIB_STAT,
@BANKRUPT_STAT,
@CLOSED_STAT,
@REVOKED_STAT,
@FROZEN_STAT,
@INT_ACCU_STAT,
@LOST_STAT,
@STOLEN_STAT,
@CHARGEOFF_STAT,
@DELINQUENT_STAT,
@OVERLIMIT_STAT,
@OD_STAT,
@VIP_ALT,
@RET_SCORE,
@ANI_NEG,
@MRK_INITIATIVE1,
@MRK_INITIATIVE2,
@VERIFICATION_ST,
@CALL_REASON,
getdate()
)
END
GO
The deletes use the following. (I have been through MANY different versions, but this one is the most efficient I have been able to build.)
CREATE PROCEDURE temp_purge_attachdatadetail_data AS
begin
declare @count int
set @count=0
insert into delete_connids_attach select connid from attachdatadetail (tablock) where time_stamp < getdate()-94
select @count=count(1) from delete_connids_attach (tablock)
while @count>0
begin
select top 100000 connid into #connif1223 from delete_connids_attach
begin transaction t1
delete from attachdatadetail with (rowlock) where connid in
(select connid from #connif1223);
commit transaction t1
begin transaction t2
delete from delete_connids_attach with (tablock) where connid in
(select connid from #connif1223);
commit transaction t2
drop table #connif1223
select @count=count(1) from delete_connids_attach (tablock)
end
end
GO
Thank you for your time,
Troy Doherty
Telephony Information Services
JPMorganChase&Co.
December 21, 2007 at 11:33 am
Troy,
I sympathize and I'd really recommend looking at 2005, even in 8.0 mode. The architectural enhancements will help with the load, especially if you can move to 64-bit.
That aside, a couple things I'd suggest.
1. I'd calculate the date of deletion and use that as a scalar.
select @dt = datediff( d, -94, getdate())
delete ... where dt < @dt
This gives you some efficiency since the engine doesn't have to evaluate a function in the query. That might help slightly.
2. If this is not a critical process, meaning that you're trimming rows to manage data, what about breaking it into batches? You're storing off IDs anyway, so you only delete xx rows in a transaction. I'd recommend you delete specific rows, so
select top 10 connid into Deletetable from #connif1223
delete ad
from attachdatadetail ad with (rowlock)
inner join DeleteTable d
on ad.connid = d.connid
and repeat. You can manage this as a transaction or not and record whether the deletes proceed.
I also dislike temp tables for a regular process like this. There have been issues in the past, and I prefer to manage this with permanent tables that I process and handle myself. That way, if the process explodes, you don't have a huge rollback on your hands, no large transaction hit, etc. You can even mark the IDs that get deleted right in the table. We use this for our email system here: we mark xx rows as working, then grab them and perform an action. If it works, we mark them again as complete. If something blows up, like a particular machine, we can see where we are in the process.
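Something along these lines, as a rough sketch (the work table, its status values, and the batch size are all made up for illustration; adapt them to your schema):
-- Permanent work table instead of a temp table (illustrative only)
CREATE TABLE dbo.PurgeQueue
(
connid varchar(30) NOT NULL,
status tinyint NOT NULL DEFAULT 0, -- 0 = pending, 1 = working, 2 = complete
queued_date datetime NOT NULL DEFAULT GETDATE()
)
GO
-- 1. Queue the ids to purge, once
DECLARE @dt datetime
SET @dt = DATEADD(d, -94, GETDATE())
INSERT INTO dbo.PurgeQueue (connid)
SELECT connid FROM dbo.AttachdataDetail WHERE time_stamp < @dt
-- 2. Mark a small batch as "working"
SET ROWCOUNT 10000
UPDATE dbo.PurgeQueue SET status = 1 WHERE status = 0
SET ROWCOUNT 0
-- 3. Delete that batch from the detail table, then mark it complete
DELETE ad
FROM dbo.AttachdataDetail ad
INNER JOIN dbo.PurgeQueue q ON ad.connid = q.connid
WHERE q.status = 1
UPDATE dbo.PurgeQueue SET status = 2 WHERE status = 1
-- repeat steps 2 and 3 until nothing is left with status = 0
If a run dies partway through, the status column tells you exactly where it stopped.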
Good luck. Hope you don't get called for this next week.
December 21, 2007 at 11:35 am
The problem is that you never give the system time to "breathe"... sure, it only takes about 3 seconds to delete 100,000 rows provided you don't have Delete Triggers on the table and a whole bunch of DRI... but you have a tight loop that's almost as fast as just deleting all of the rows at once.
I'd recommend cutting down the number of rows to be deleted so each individual delete only takes 1 second or so. AND you should also add a delay so the insert mechanism can get a word in edgewise. To add the delay, you would use something like the following...
WAITFOR DELAY '00:00:10'
That will give you a delay of ten seconds... change it to your liking, but 5 or 10 seconds will allow the insert process to clear its guns.
The other thing to consider: instead of taking the overhead of an explicit transaction, try using the table hint WITH (TABLOCKX)... it's nasty fast.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 21, 2007 at 11:55 am
If you do what Steve mentioned and use a "real" table to hold the rows you want to delete, you could run the select in one pass and then have another process that does the deletes, instead of doing it all in one procedure.
When I have had to run deletes based on a date against busy tables, I have used a scheduled job that runs more frequently and deletes smaller batches. Something like:
Declare @keep_date datetime
Set @keep_date = getdate() - 94
If Exists(select * from attachdatadetail where time_stamp < @keep_date)
Begin
Delete A
From
attachdatadetail A Join
(Select Top 100000 connid from attachdatadetail where time_stamp < @keep_date) B On
A.connid = B.connid
End
Odds are I would be running it often enough that I would not need the Top in the subquery.
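For what it's worth, the scheduling side might look roughly like this, assuming the delete above is already wrapped in a SQL Agent job (the job name and the every-10-minutes frequency are only placeholders):
-- Add an every-10-minutes schedule to an existing Agent job (names are illustrative)
EXEC msdb.dbo.sp_add_jobschedule
@job_name = N'Purge AttachdataDetail',
@name = N'Every 10 minutes',
@enabled = 1,
@freq_type = 4, -- daily
@freq_interval = 1, -- every day
@freq_subday_type = 4, -- sub-day units = minutes
@freq_subday_interval = 10, -- every 10 minutes
@active_start_time = 0 -- starting at midnight
GO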
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
December 21, 2007 at 11:59 am
Steve, Jeff,
Thank you both for your quick replies. I will be trying a combination of your suggestions after the holidays. I'd hate to break something for Christmas.
I prefer not to use temp tables as well. The previous versions were VERY temp-heavy. Even on a beefy machine we ran into memory and CPU issues.
I will post my changes and results.
Troy Doherty
December 21, 2007 at 12:42 pm
Do you have any indexes other than the one you posted? Especially, do you have a clustered index?
If you don't have a cluster, I would suggest a clustered index on the timestamp column, then do the deletes like Steve suggested.
Since the clustered index defines the physical order, it means that your inserts and your deletes are happening at different places in the table. This should reduce the blocking that you have. You might even be able to get away with page locks for the delete (though I won't promise that). Try and see.
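For example, something along these lines (the index name is just illustrative, and building a clustered index on ~100 million rows is a maintenance-window job, not something to run during peak hours):
CREATE CLUSTERED INDEX IXC_AttachdataDetail_TimeStamp
ON dbo.AttachdataDetail (time_stamp)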
BTW, nolock on a delete won't do much. Nolock only affects read locks, not exclusive locks.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 21, 2007 at 1:27 pm
Sorry, I missed part of the table configuration.
CLUSTERED INDEX on (time_stamp, connid)
NONCLUSTERED INDEX on (connid, key1)
I did a bit of testing and limited my purge to 6 records (we only have 6 records older than 120 days). It took over 5 minutes to run.
Declare @keep_date datetime
Set @keep_date = getdate() - 120
If Exists(select * from attachdatadetail where time_stamp < @keep_date)
Begin
Delete A
From
attachdatadetail A Join
(Select Top 100000 connid from attachdatadetail where time_stamp < @keep_date) B On
A.connid = B.connid
End
I am still testing doing smaller delete batches without using a tmp table.
December 21, 2007 at 8:01 pm
Troy Doherty (12/21/2007)
By deleting records < select @dt = datediff(d, -94, getdate()), I would have deleted all of my data..... :w00t:
Heh... Troy, you must have a death wish... testing deletes on real data. Good way to end up in a soup line literally overnight.
Let's make some test data that no one will care about if you delete it, eh? Here's the code I use to make test data with a couple of changes in it to suit the current situation...
--===== If the test table exists, drop it
IF OBJECT_ID('dbo.JBMTestDetail','U') IS NOT NULL
DROP TABLE dbo.JBMTestDetail
GO
--===== Create and populate a 1,000,000 row test table.
-- Column "ConnID" has a range of 1 to 100,000 non-unique numbers stored as VARCHAR(30)
-- Column "Key1" has a range of 1 to 100,000 non-unique numbers stored as VARCHAR(30)
-- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers
-- Column "Time_Stamp" has a range of >=01/01/2005 and <01/01/2015 non-unique date/times
-- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'
-- for all rows.
-- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
SELECT TOP 1000000
ConnID = ISNULL(CAST(ABS(CHECKSUM(NEWID()))%100000+1 AS VARCHAR(30)),''), --(10 rows per connection)
Key1 = ISNULL(CAST(ABS(CHECKSUM(NEWID()))%100000+1 AS VARCHAR(20)),''), --just to test index with
SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),
SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
Time_Stamp = ISNULL(CAST(RAND(CHECKSUM(NEWID()))*3652.0+38351.0 AS DATETIME),0),
SomeHex12 = RIGHT(NEWID(),12)
INTO dbo.JBMTestDetail
FROM Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN
--===== Create indexes similar to Troy's
CREATE CLUSTERED INDEX IXC_JBMTestDetail_Time_Stamp_ConnID ON dbo.JBMTestDetail (Time_Stamp,ConnID)
CREATE NONCLUSTERED INDEX IX_JBMTestDetail_ConnID_Key1 ON dbo.JBMTestDetail (ConnID,Key1)
Ok... now, like I've been saying... if you're gonna do a lot of deletes while the system is trying to do inserts, you have to give the system some time to do the inserts. That takes a delay in the delete loop. Also, your non-clustered index is going to be part of the "killer"... it takes time to also delete from the index. So, you need to cut down on the number of rows you're trying to delete at one time... you need to get the deletes done in the loop down to about 1 second... like this... (Read the comments!!! They're important!!!)...
--===== Setup to measure performance...
SET STATISTICS TIME ON
--===== Define the cutoff date with a time of "midnight" or, if you will,
-- define the cutoff date with no time so we only delete whole days.
DECLARE @CutoffDate DATETIME
SELECT @CutoffDate = DATEADD(dd,DATEDIFF(dd,0,GETDATE()),-94)
--===== Limit all further queries, including deletes, to 25,000 rows
-- (about 1 second worth of deletes, like I said before)
SET ROWCOUNT 25000
--===== See if any rows qualify for deletion. If even just one exists,
-- then there's work to do and @@ROWCOUNT will be > 0.
-- DO NOT PUT ANY CODE BETWEEN THIS SELECT AND THE WHILE LOOP OR
-- YOU'LL MESS UP THE ROWCOUNT FOR THE WHILE LOOP
SELECT TOP 1 1 FROM dbo.JBMTestDetail WHERE Time_Stamp < @CutoffDate
--===== If the rowcount from the above is greater than 0,
-- then delete 25,000 rows at a time until there's nothing
-- left to delete
WHILE @@ROWCOUNT > 0
BEGIN
--===== Just a "marker" to separate the loop in the output
PRINT REPLICATE('=',78)
--===== This delay gives other processes breathing room
WAITFOR DELAY '00:00:10'
--===== Do the delete. Will be limited by the SET ROWCOUNT above.
-- DO NOT PUT ANY CODE BETWEEN THIS DELETE AND THE "END" OR
-- YOU'LL MESS UP THE ROWCOUNT FOR THE WHILE LOOP.
DELETE dbo.JBMTestDetail WITH (TABLOCKX)
WHERE Time_Stamp < @CutoffDate
END
--===== Restore the ability to process more than 25,000 rows
SET ROWCOUNT 0
The code above deletes about 270,000 rows in about 2 1/4 minutes... of course it would delete a lot faster without the built in delay, but then that would cause the long winded blocking you were concerned about.
I call this type of code a "Delete Crawler"... it crawls through what it needs to delete and it doesn't matter how long it takes because it causes minimal blocking.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 24, 2007 at 7:51 am
Once again thank you for all the great replies.
Jeff, I have a sandbox hung off of our DEV environment, so there's no risk to real data. I have a simulator we built that actually inserts "fake" call data into the DB at a rate of up to 100 records per second. I am testing at 10 RPS to mimic actual production traffic.
Below is the last version I was working with. I will review and test your suggestions.
Again, thank you for all the help
begin
declare @count int
declare @dt datetime
set @dt = getdate()-94
set @count=0
insert into delete_connids_route select connid from routedetail (tablock) where time_stamp < @dt
select @count=count(1) from delete_connids_route (tablock)
while @count>0
begin
select top 100000 connid into #connif1222 from delete_connids_route
begin transaction t1
delete from routedetail with (rowlock) where connid in
(select connid from #connif1222);
commit transaction t1
begin transaction t2
delete from delete_connids_route with (tablock) where connid in
(select connid from #connif1222);
commit transaction t2
drop table #connif1222
WAITFOR DELAY '00:00:05'
select @count=count(1) from delete_connids_route (tablock)
end
end
Troy Doherty
December 26, 2007 at 7:38 am
Jeff,
After extensive testing, your solution is the best for our environment. Thank you very much for all the help. :)
Troy Doherty
December 26, 2007 at 10:23 am
Outstanding! Thank you so much for the feedback, Troy. Very happy to have been able to help.
--Jeff Moden
Change is inevitable... Change for the better is not.