Diagnosing Page Latch Issue
Posted Sunday, March 10, 2013 6:49 PM by brownph99 (Grasshopper)
I'm working with a vendor's database and a purge script they've provided, and I need some outside opinions.

This purge runs incredibly long, especially on our SAN. Our first thought was that the purge was querying tables on non-indexed columns, especially several large tables, the biggest being 60 million rows and 160 GB.
We added indexes and saw the performance improvements you'd expect when purging the other tables. Additionally, after about an hour of the purge, I can query the big table being purged and, according to SQL Server, I see the expected post-purge counts/values.

The catch is that 5-6 hours after the table "looks" purged, the job is still cycling through PAGEIOLATCH waits. Checking the wait resource confirms that it's the same table the waits are on.

It doesn't appear to be a script issue, per se. But does this suggest anything I can take back to my SAN admins or the vendor?

Portion of the script (IMPLICIT_TRANSACTIONS is OFF by default in this environment):
BEGIN
    SELECT @table_name = TABLE_NAME, @duration = PURGE_DURATION FROM OLTPDB_PURGE WHERE TABLE_NAME = 'CHECKPOINTDATA';
    SET @purge_date = CURRENT_TIMESTAMP - @duration;

    DECLARE m_cursor CURSOR FOR SELECT ID FROM CHECKPOINTDATA WHERE CREATEDATE < @purge_date
    OPEN m_cursor
    FETCH NEXT FROM m_cursor INTO @str_id;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        DELETE FROM CHECKPOINTDATA WHERE ID = @str_id;
        SET @record_cnt = @record_cnt + 1;

        IF (@record_cnt >= 1000)
        BEGIN
            COMMIT;
            SET @record_cnt = 0;
        END

        FETCH NEXT FROM m_cursor INTO @str_id;
    END

    CLOSE m_cursor
    DEALLOCATE m_cursor
END
Post #1429046
Posted Monday, March 11, 2013 5:27 AM by colin.Leversuch-Roberts (SSCrazy)
Well, it sounds like an IO wait. Use the function sys.fn_virtualfilestats to see what's going on initially.
The partial script you've pasted doesn't look like it was written for SQL Server - a COMMIT without a BEGIN TRAN won't do very much one way or another.
You might also want to check for open transactions - those could cause issues.
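
For example, something along these lines will show the cumulative I/O counts and stalls per file (just a sketch - 'OLTPDB' is a guess at your database name):

    -- Cumulative I/O counts and stall times per file for one database
    SELECT  DB_NAME(vfs.DbId) AS database_name,
            vfs.FileId,
            vfs.NumberReads,
            vfs.NumberWrites,
            vfs.IoStallReadMS,
            vfs.IoStallWriteMS,
            vfs.IoStallMS
    FROM    sys.fn_virtualfilestats(NULL, NULL) AS vfs
    WHERE   vfs.DbId = DB_ID('OLTPDB')
    ORDER BY vfs.IoStallMS DESC;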


The GrumpyOldDBA
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
Post #1429181
Posted Tuesday, March 12, 2013 8:14 AM by Kevin G. Boles (SSCarpal Tunnel)
Have them stop deleting data one row at a time!!! Using a cursor for this is ridiculous! This should be done in batches of X records at a time using an index seek to get to the proper ranges of batches to delete. I have done this type of stuff at many clients over the years and it is both incredibly efficient AND avoids locking large ranges of data.

Oh, and there isn't a hint of error handling in that code!
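
Something along these lines (just a rough sketch, not the vendor's code; assumes a usable index on CREATEDATE and whatever retention value actually applies) deletes in chunks and lets each chunk commit on its own:

    -- Batched, set-based purge sketch: delete in chunks until nothing qualifies
    DECLARE @purge_date DATETIME = DATEADD(DAY, -30, CURRENT_TIMESTAMP);  -- example retention only
    DECLARE @rows INT = 1;

    WHILE @rows > 0
    BEGIN
        DELETE TOP (5000)
        FROM   CHECKPOINTDATA
        WHERE  CREATEDATE < @purge_date;

        SET @rows = @@ROWCOUNT;  -- each DELETE auto-commits as its own transaction
    END

Tune the TOP value so each batch stays well below lock-escalation territory.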


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1429840
Posted Wednesday, March 13, 2013 8:56 AM by wolfkillj (Ten Centuries)
brownph99 (3/10/2013)
I'm working with a vendor's database and a purge script they've provided, and I need some outside opinions.

This purge runs incredibly long, especially on our SAN. Our first thought was that the purge was querying tables on non-indexed columns, especially several large tables, the biggest being 60 million rows and 160 GB.
We added indexes and saw the performance improvements you'd expect when purging the other tables. Additionally, after about an hour of the purge, I can query the big table being purged and, according to SQL Server, I see the expected post-purge counts/values.

The catch is that 5-6 hours after the table "looks" purged, the job is still cycling through PAGEIOLATCH waits. Checking the wait resource confirms that it's the same table the waits are on.

It doesn't appear to be a script issue, per se. But does this suggest anything I can take back to my SAN admins or the vendor?

Portion of the script (IMPLICIT_TRANSACTIONS is OFF by default in this environment):
BEGIN
    SELECT @table_name = TABLE_NAME, @duration = PURGE_DURATION FROM OLTPDB_PURGE WHERE TABLE_NAME = 'CHECKPOINTDATA';
    SET @purge_date = CURRENT_TIMESTAMP - @duration;

    DECLARE m_cursor CURSOR FOR SELECT ID FROM CHECKPOINTDATA WHERE CREATEDATE < @purge_date
    OPEN m_cursor
    FETCH NEXT FROM m_cursor INTO @str_id;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        DELETE FROM CHECKPOINTDATA WHERE ID = @str_id;
        SET @record_cnt = @record_cnt + 1;

        IF (@record_cnt >= 1000)
        BEGIN
            COMMIT;
            SET @record_cnt = 0;
        END

        FETCH NEXT FROM m_cursor INTO @str_id;
    END

    CLOSE m_cursor
    DEALLOCATE m_cursor
END


What's going on in the rest of the script? Are you certain the waits are related to this portion of the code?

As was previously stated, this is a majorly inefficient way of deleting multiple rows from a table. I would question why your vendor is sending you this script - it suggests a real lack of depth of understanding of T-SQL, SQL Server, and set-based operations.
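
For comparison, the whole cursor block collapses to a single statement (sketch, using the same @purge_date the script already computes):

    -- Set-based equivalent of the entire cursor loop
    DELETE FROM CHECKPOINTDATA
    WHERE  CREATEDATE < @purge_date;

On a 60-million-row table you would still want to break that into batches, but it shows how much machinery the cursor adds.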


Jason Wolfkill
Blog: SQLSouth
Twitter: @SQLSouth
Post #1430447
Posted Wednesday, March 13, 2013 1:35 PM by brownph99 (Grasshopper)
wolfkillj (3/13/2013)

What's going on in the rest of the script? Are you certain the waits are related to this portion of the code?

As was previously stated, this is a majorly inefficient way of deleting multiple rows from a table. I would question why your vendor is sending you this script - it suggests a real lack of depth of understanding of T-SQL, SQL Server, and set-based operations.


The rest of the script is essentially a repetition of the block I posted, with a few exceptions where additional child-table records are deleted. I watched the page resource waits, and after pulling the page details with DBCC PAGE, the pages are on the same table that has already been purged.

As to the vendor logic, I'm in full agreement there. At a minimum, I'm looking at turning IMPLICIT_TRANSACTIONS ON before executing the script.
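
In other words, something like this wrapped around the vendor script (just a sketch):

    SET IMPLICIT_TRANSACTIONS ON;

    -- ... vendor purge script runs here ...

    IF @@TRANCOUNT > 0
        COMMIT;  -- pick up whatever the script's last partial batch left open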
Post #1430602
Posted Thursday, March 14, 2013 9:16 AM by wolfkillj (Ten Centuries)
colin.Leversuch-Roberts (3/11/2013)
Well, it sounds like an IO wait. Use the function sys.fn_virtualfilestats to see what's going on initially.
The partial script you've pasted doesn't look like it was written for SQL Server - a COMMIT without a BEGIN TRAN won't do very much one way or another.
You might also want to check for open transactions - those could cause issues.


Actually, with SET IMPLICIT_TRANSACTIONS ON, the DELETE would implicitly open a transaction that must then be either committed or rolled back by the code.

On my SQL Server 2008R2 instance, with SET IMPLICIT_TRANSACTIONS OFF, a COMMIT without a BEGIN TRANSACTION throws this error:

Msg 3902, Level 16, State 1, Line 10
The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION.


But of course the statements preceding the COMMIT are auto-committed when they complete, so the results are not affected.
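
A trivial repro (sketch):

    SET IMPLICIT_TRANSACTIONS OFF;

    SELECT 1;   -- runs and auto-commits
    COMMIT;     -- raises Msg 3902: no corresponding BEGIN TRANSACTION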

@brownph99, are you sure this code runs with IMPLICIT_TRANSACTIONS OFF (if you have SET ANSI_DEFAULTS ON, then IMPLICIT_TRANSACTIONS will be ON also)? If so, does your code have a BEGIN TRANSACTION statement somewhere above the snippet you posted? If not, do you see the error message above? If you don't see the error message, that suggests the code is actually running on a connection with SET IMPLICIT_TRANSACTIONS ON.

If you are indeed running with SET IMPLICIT_TRANSACTIONS ON, I think I see the cause of your problem. If the code that populates your cursor returns 1125 rows, it loops 1000 times to delete 1000 rows, then the IF condition evaluates to TRUE and the code block with the COMMIT runs. Then, it loops through 125 times to delete 125 rows, but the IF condition never evaluates to TRUE, so the COMMIT never runs and you end up with an open transaction that is not committed or rolled back.

Even if you are running with SET IMPLICIT_TRANSACTIONS OFF and have opened a transaction somewhere above this code block, that last loop of <1000 rows coming out of your cursor will not invoke the COMMIT and the transaction will remain open.

So, do you have open transactions on this connection after the table "looks" purged?
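
You can check with something like this (sketch - 'OLTPDB' is a guess at the database name):

    -- Oldest active transaction in the database
    DBCC OPENTRAN ('OLTPDB');

    -- Or list sessions that still have a transaction open
    SELECT  s.session_id,
            s.login_name,
            at.transaction_id,
            at.transaction_begin_time
    FROM    sys.dm_tran_session_transactions AS st
    JOIN    sys.dm_exec_sessions AS s
            ON s.session_id = st.session_id
    JOIN    sys.dm_tran_active_transactions AS at
            ON at.transaction_id = st.transaction_id;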


Jason Wolfkill
Blog: SQLSouth
Twitter: @SQLSouth
Post #1431052
Posted Thursday, March 14, 2013 4:16 PM by brownph99 (Grasshopper)
wolfkillj (3/14/2013)
Actually, with SET IMPLICIT_TRANSACTIONS ON, the DELETE would implicitly open a transaction that must then be either committed or rolled back by the code.

On my SQL Server 2008R2 instance, with SET IMPLICIT_TRANSACTIONS OFF, a COMMIT without a BEGIN TRANSACTION throws this error:

Msg 3902, Level 16, State 1, Line 10
The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION.


But of course the statements preceding the COMMIT are auto-committed when they complete, so the results are not affected.

@brownph99, are you sure this code runs with IMPLICIT_TRANSACTIONS OFF (if you have SET ANSI_DEFAULTS ON, then IMPLICIT_TRANSACTIONS will be ON also)? If so, does your code have a BEGIN TRANSACTION statement somewhere above the snippet you posted? If not, do you see the error message above? If you don't see the error message, that suggests the code is actually running on a connection with SET IMPLICIT_TRANSACTIONS ON.

If you are indeed running with SET IMPLICIT_TRANSACTIONS ON, I think I see the cause of your problem. If the code that populates your cursor returns 1125 rows, it loops 1000 times to delete 1000 rows, then the IF condition evaluates to TRUE and the code block with the COMMIT runs. Then, it loops through 125 times to delete 125 rows, but the IF condition never evaluates to TRUE, so the COMMIT never runs and you end up with an open transaction that is not committed or rolled back.

Even if you are running with SET IMPLICIT_TRANSACTIONS OFF and have opened a transaction somewhere above this code block, that last loop of <1000 rows coming out of your cursor will not invoke the COMMIT and the transaction will remain open.

So, do you have open transactions on this connection after the table "looks" purged?


Right. With IMPLICIT_TRANSACTIONS ON, a transaction would be started and would need to be explicitly committed. The difference in this case is that instead of 1,000 single-row transactions when it's OFF, it's one transaction containing 1,000 deletes.

I haven't actually seen the error you described, but I did confirm that IMPLICIT_TRANSACTIONS needs to be ON for the COMMIT to be worth anything.

The problem with the <1000 remaining rows is an actual bug in the script that I pointed out to the vendor over a month ago, but it apparently fell on deaf ears.
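
The fix would be trivial - something along these lines after the cursor loop (sketch):

    CLOSE m_cursor
    DEALLOCATE m_cursor

    -- Commit the final partial batch the loop never hits 1000 rows on
    IF @record_cnt > 0 AND @@TRANCOUNT > 0
    BEGIN
        COMMIT;
        SET @record_cnt = 0;
    END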
Post #1431290