Adding a computed persisted column to a large table errors out due to transaction log size

  • I have a 1.5 TB table (3.8 billion records) and want to add a new computed persisted column so that I can partition the table on that column.

    When I executed the ALTER, it failed after 6 days with an error that the transaction log was full. In SQL Server 2008 a single log file is limited to 2 TB, so we have now added 3 more transaction log files of 2 TB each.

    Is it a good idea to proceed with this change? Can you please suggest a different approach that would make it faster?
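
    The statement I am running is essentially this (real table, column and expression names replaced with placeholders):

        ALTER TABLE dbo.BigTable
        ADD PartitionKey AS (YEAR(CreatedDate) * 100 + MONTH(CreatedDate)) PERSISTED NOT NULL;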

  • What is your backup mode (recovery model)?

    Did you enclose each batch of changes within a BEGIN/COMMIT transaction?
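
    You can check the recovery model with something like this (the database name is a placeholder):

        SELECT name, recovery_model_desc
        FROM sys.databases
        WHERE name = N'YourDatabase';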

    If everything seems to be going well, you have obviously overlooked something.

    Ron

  • Are you just adding the column, or also adding it to the primary key?

  • I didn't use a begin/commit transaction in the script that failed, but in the latest script I have added BEGIN and COMMIT TRANSACTION around the ALTER TABLE.

    I am not sure whether any other activity was running during the failure, but now all accounts are deactivated and only this change is running, starting with a smaller table.

    The 23 million row table completed in 20 minutes, but I have yet to start the script on the 3.8 billion row table. I am wondering whether it will fail after 5 days because of the transaction log size.

  • I am just adding the column as NOT NULL, and it failed after 6 days.

  • Selvaraju (6/20/2011)


    The 23 million row table completed in 20 minutes, but I have yet to start the script on the 3.8 billion row table. I am wondering whether it will fail after 5 days because of the transaction log size.

    Sorry, there is not enough information to say for sure.

    How is the table defined? Is it a heap or is it clustered? If clustered, what is the fill factor of the index?

    What is the data type of the new persisted computed column, and is your calculation the same as the one you used on the 23 million row table?
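
    If you are not sure, something along these lines will show whether the table is a heap or clustered, and the fill factor (the table name is a placeholder):

        SELECT name, type_desc, fill_factor
        FROM sys.indexes
        WHERE object_id = OBJECT_ID(N'dbo.BigTable');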

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • A far out idea, if you have sufficient disk space: create a new table exactly like the one to which you want to add the persisted column, but with the persisted computed column included in the new table.

    Set up an INSERT statement that copies records from the existing table into the new table, say a few thousand or a hundred thousand per batch, each batch within a transaction. When the transaction completes, perform a log backup. Delay for a minute or so after the backup to allow other activity to take place and avoid slowing down the entire system excessively. Then repeat the batch, and keep doing that until all records have been written to the new table with the persisted computed column.

    Then rename the old table, rename the new table to the old table's name ... and you're done. Remember to transfer the last batch of records as quickly as possible and change the names so that no data is lost; in fact, do the name change in single user mode if you can. Of course, test all of this in a NON PRODUCTION database before attempting it on the production DB.
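
    The loop would be something along these lines (untested; the table, column, database and file names are placeholders, and it assumes an ever-increasing key column you can walk through in ranges):

        DECLARE @BatchSize BIGINT;
        DECLARE @LastId    BIGINT;
        DECLARE @MaxId     BIGINT;

        SET @BatchSize = 100000;
        SET @LastId = 0;
        SELECT @MaxId = MAX(Id) FROM dbo.BigTable;

        WHILE @LastId < @MaxId
        BEGIN
            BEGIN TRANSACTION;

            -- copy one slice of rows; the persisted computed column in BigTable_New fills itself in
            INSERT INTO dbo.BigTable_New (Id, Col1, Col2)
            SELECT Id, Col1, Col2
            FROM dbo.BigTable
            WHERE Id > @LastId AND Id <= @LastId + @BatchSize;

            COMMIT TRANSACTION;

            SET @LastId = @LastId + @BatchSize;

            -- back up the log after each batch so the space can be reused instead of the log growing
            BACKUP LOG YourDatabase TO DISK = N'X:\Backups\YourDatabase_log.trn';

            -- pause for a minute so other activity is not starved
            WAITFOR DELAY '00:01:00';
        END;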

    If everything seems to be going well, you have obviously overlooked something.

    Ron

  • I didn't set up any specific definition such as a heap, a clustered index, or a fill factor. The data type of the persisted computed column is INT.

    I will post the solution I followed to implement this on the huge table.

  • Adding the computed column always takes many hours and uses a lot of transaction log. For one table with 2 billion records, it took 35 hours to add the computed column to the existing data.

    For the 3.7 billion record table, I decided to follow the steps below, and they executed successfully.

    1. Created an empty table with the same structure as the existing table.

    2. Added the computed column to the empty table.

    3. Loaded the data from the existing table into the new table using an SSIS package that commits every 10,000 records.

    4. The SSIS package ran in 3 phases of roughly 1.5 billion records each, based on a column filter.

    5. The data load took around 60 hours.

    6. Renamed the actual table to old, and renamed the new table to be the actual table (the rename is sketched below).
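
    The swap in step 6 was just two renames, roughly like this (placeholder names, done during a short quiet window so no new rows were missed):

        BEGIN TRANSACTION;
        EXEC sp_rename 'dbo.BigTable', 'BigTable_Old';
        EXEC sp_rename 'dbo.BigTable_New', 'BigTable';
        COMMIT TRANSACTION;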

    Thanks for all your time sharing the information. Please let me know if you need more clarification.

  • Thanks for posting back. I bet you could have cut that time in half, at least, if you had used T-SQL (as bitbucket-25253 suggested) instead of SSIS.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • The reason for the SSIS package was to track the status of the data insert; moreover, it is reliable, and you can check the log file in case of a failure. The SSIS package has an internal commit for every 10,000 records. You may be correct that I would have saved some time if I had used T-SQL.
