ALTER TABLE statement took down database - HELP please to stop it happening in future

  • DISCLAIMER:  I'm a dev rather than a DBA.

    We use Log4Net to capture database interactions - lots of INFO messages, a few WARNINGs and, crucially, ERRORs.  Recently our website was misbehaving and, in an effort to diagnose it, I looked at the Log4NetLog table (there are currently 17,756,271 rows in it).

    I realised that there was a problem with the design of the table - the useful columns [Message] and [Exception] are nvarchar(4000) and nvarchar(2000) respectively.  Our code was trying to write the Exception thrown by the application, but because the column wasn't wide enough the stack trace was being truncated.

    So I asked our data team to run the following code:

    ALTER TABLE [dbo].[Log4NetLog]
    ALTER COLUMN [Message] nvarchar(MAX) NOT NULL;
    ALTER TABLE [dbo].[Log4NetLog]
    ALTER COLUMN [Exception] nvarchar(MAX) NOT NULL;

    Unfortunately, because of the huge amount of data in the table this essentially took the database off-line, and thus killed the website.

    I have two questions, if I may:

    1.  In the event of someone attempting to execute such a query in future, what is the best way to kill the running query?  The database was out of action for half an hour - could that have been sped up?
    2.  It would be really useful to be able to apply the DDL script above.  To this end we plan a) to schedule it overnight (the website is only used in this country and has virtually no traffic at night) and b) to remove old records - say, anything older than a year.  However, my instinct is that if we were to run a query like

    DELETE
    FROM
        Log4NetLog
    WHERE
        ([Date] < '27 September 2017')

    that this would similarly crash the database.  Are there any strategies that might be applied to remove huge numbers of superfluous records?

    Many thanks

    Edward

  • Edward

    With 18 million rows, and not narrow ones at that, you'd have to expect it to take a long time.  You could try creating a new table and moving the data into it instead - but that would be more to prevent excessive transaction log growth than to save time.

    John

  • John Mitchell-245523 - Thursday, September 27, 2018 8:06 AM

    Thanks John.  I wondered whether there was a pattern for easy trimming of unwanted data - an approach for doing this in easy stages.  Old data could be deleted without impacting our reporting capabilities.

  • Edward

    Yes, you can delete in batches.  Just delete TOP (10000) (or the number of your choice) over and over until there are no more rows that you want to delete.  You may want to wait a few seconds and/or run a CHECKPOINT and/or take a transaction log backup after each individual delete, in order to keep log growth in check.
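
    A minimal sketch of that loop (untested, and assuming the Log4NetLog table and [Date] column from your post - adjust the batch size and add a log backup step to suit your recovery model):

    DECLARE @BatchSize int = 10000;

    WHILE 1 = 1
    BEGIN
        DELETE TOP (@BatchSize)
        FROM dbo.Log4NetLog
        WHERE [Date] < DATEADD(YEAR, -1, GETDATE());   -- keep one year of logs

        IF @@ROWCOUNT = 0
            BREAK;                     -- nothing left to delete

        CHECKPOINT;                    -- helps the log clear down in SIMPLE recovery
        WAITFOR DELAY '00:00:05';      -- let other work get a look-in
    END;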

    As for killing the running query, possibly not.  Killing the query rolls back the transaction, which may take even longer than allowing it to finish.

    John

  • John Mitchell-245523 - Thursday, September 27, 2018 8:19 AM

    John

    I discussed the delete-in-batches approach with others in my team, but I assumed that the queries to a) identify the rows to be deleted and b) delete them might have a negative effect on DB performance.  However, I'll have a look and see if I can get this to work.

    As for killing the query - thanks for the information.  We'll know next time!

    Edward

  • Edward

    That depends on your indexing.  A log table is the perfect candidate to have its clustered index on the Date column.  If that's what you have then DELETE TOP (10000) FROM Log4NetLog WHERE Date < '20170927' will be real quick.
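
    If the table doesn't already have its clustered index on [Date], something like this would create one (just a sketch - check what the existing clustered index is first, and bear in mind that building it on 18 million rows is itself a job for a quiet period):

    CREATE CLUSTERED INDEX IX_Log4NetLog_Date
        ON dbo.Log4NetLog ([Date]);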

    John

  • John Mitchell-245523 - Thursday, September 27, 2018 8:38 AM

    Thanks John.

  • Here's an idea.  
    Create a view that selects the data you want to keep.  
    BCP out the data from the view. 
    Drop, or truncate the table. 
    BCP the data back in.
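
    A rough sketch of those steps (untested - the view name, file path, server and database names are placeholders, and the bcp commands are run from a command prompt, not from SSMS):

    -- Keep only the last year of log rows.
    CREATE VIEW dbo.vLog4NetLog_Keep
    AS
    SELECT *
    FROM dbo.Log4NetLog
    WHERE [Date] >= DATEADD(YEAR, -1, GETDATE());

    -- From a command prompt (native mode, trusted connection):
    --   bcp MyDatabase.dbo.vLog4NetLog_Keep out C:\Temp\Log4NetLog.dat -n -S MyServer -T
    -- Then TRUNCATE TABLE dbo.Log4NetLog (or drop and re-create it), and:
    --   bcp MyDatabase.dbo.Log4NetLog in C:\Temp\Log4NetLog.dat -n -S MyServer -T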

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Michael L John - Thursday, September 27, 2018 9:16 AM

    Thank you - an interesting suggestion!

  • edwardwill - Thursday, September 27, 2018 9:20 AM

    We have just completed a similar exercise to purge old data for GDPR.
    As an example, the most recent was a table containing ~57 million records.  We had to remove 7 million records.  
    This method took just under 30 minutes.  

    More details:
    1. SELECT * into Table_OLD from Table
    2. Create the view
    3. BCP out the data.  That took 6 minutes
    4. Drop the table
    5. Create the table, leaving off the keys and indexes
    6. BCP in the data.  That took 11 minutes
    7. Re-create the PK, indexes, and foreign keys. That took 18 minutes

    BCP in will be slower than out.  Use native mode, not character mode. 

    Here's another question.  Do you need the MAX data types?

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • edwardwill - Thursday, September 27, 2018 8:28 AM

    Trying to delete a large number of records in one go potentially has a rather more serious impact: once a single delete takes out around 5,000 locks, the engine escalates to a table lock, otherwise the server would become resource-starved and eventually fall over (look up lock escalation).  Batching - if the batches are small enough; I often use 3,000 - stops the table from being locked.

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • edwardwill - Thursday, September 27, 2018 7:00 AM

    To avoid the issue when altering the columns, you could just create new columns.  Adding the new columns would be very quick since there is no validation that needs to be done - you would just need to update the code to use the new columns, or you could create a view that checks each column for null and returns the non-null value (e.g. COALESCE(MessageMax, Message) AS Message, COALESCE(ExceptionMax, Exception) AS Exception).
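
    A minimal sketch of that approach (the MessageMax/ExceptionMax column names and the view name are only illustrative):

    -- Adding nullable columns with no default is a metadata-only change, so it's quick.
    ALTER TABLE dbo.Log4NetLog ADD MessageMax nvarchar(MAX) NULL;
    ALTER TABLE dbo.Log4NetLog ADD ExceptionMax nvarchar(MAX) NULL;
    GO

    CREATE VIEW dbo.vLog4NetLog
    AS
    SELECT
        [Date],                                           -- plus whatever other columns you need
        COALESCE(MessageMax, [Message])     AS [Message],
        COALESCE(ExceptionMax, [Exception]) AS [Exception]
    FROM dbo.Log4NetLog;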

    Or - since this is just a logging table - you could create a new table with the appropriate data types under a new name.  Then, at a quiet time, rename the existing table to an archive name and the new table to the original name.  The rename should be very quick, and once it's completed your system will start logging to the new table.
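
    For example (a sketch - the _New/_Archive names are illustrative, and the new table would already have been created with the nvarchar(MAX) columns):

    -- Swap the tables in one short transaction at a quiet time.
    BEGIN TRANSACTION;
        EXEC sp_rename 'dbo.Log4NetLog',     'Log4NetLog_Archive';
        EXEC sp_rename 'dbo.Log4NetLog_New', 'Log4NetLog';
    COMMIT TRANSACTION;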

    If you did that - then deleting from the archive table would have zero impact on your application since that table is no longer being utilized.  I would still recommend batching the deletes so you don't negatively impact the transaction log.

    And one more item - if you can set up the new table with an appropriate partition scheme, you could eliminate any future issues with archiving/purging data by switching out the old partitions and creating new ones.  You could set up the partitions on a monthly basis...
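
    A bare-bones sketch of a monthly scheme (assuming [Date] is datetime2; the names and boundary dates are illustrative, the new Log4NetLog table would be created ON the partition scheme, and Log4NetLog_Staging would be an empty table of identical structure on the same filegroup):

    CREATE PARTITION FUNCTION pfLogMonthly (datetime2)
        AS RANGE RIGHT FOR VALUES ('2018-08-01', '2018-09-01', '2018-10-01');

    CREATE PARTITION SCHEME psLogMonthly
        AS PARTITION pfLogMonthly ALL TO ([PRIMARY]);

    -- Purge an old month by switching its partition into the staging table,
    -- then truncating the staging table - both operations are effectively instant.
    ALTER TABLE dbo.Log4NetLog SWITCH PARTITION 1 TO dbo.Log4NetLog_Staging;
    TRUNCATE TABLE dbo.Log4NetLog_Staging;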

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

    If there aren't any foreign keys (and logging tables like this rarely use them), just sp_rename the existing table to something else and create a new one.  That way you have all the existing data in an offline table should you want to keep any of it, and your system can go about its merry way.  You can always perform a delete on the offline copy to clear out stuff you don't need.

    Going forward, have a regular process in place to purge/archive unneeded records before it all gets too large - either by just deleting stuff or by partitioning the table and using partition switching to move data out of the table.

  • Thank you everyone for all your helpful input.  I believe that our team has a plan, now (and for Blackadder fans, it's a cunning one).

    Best wishes

    Edward
