Question about ENORMOUS transaction log

  • Hello there,

    I have looked through my books, this forum, and a few others trying to find an answer before bringing this up here.

    We are running the following code in Query Analyzer (SQL Server 2000 SP3):

    ALTER TABLE FINANCIAL_LINE_ITEM ADD CONSTRAINT PK_FINANCIAL_LINE_ITEM
    PRIMARY KEY CLUSTERED (ENTITY_ID, TRANSACTION_ID, LINE_ITEM_NUM, FORM_SEQ_NUM)
    WITH FILLFACTOR = 80 ON [FIN_IDX];

    So, basically adding a clustered index on a table.

    This table is huge. Looking in taskpad, I see the following:

    Rows: 937 million

    Table size: 334 GB

    Index size (before the operation above): 49 GB

    We have plenty of room (a SAN), nice machines, and we believe we can complete this operation with no problem, except for one thing: the transaction log keeps filling up:

    Server: Msg 9002, Level 17, State 6, Line 1
    The log file for database 'DTAX' is full. Back up the transaction log for the database to free up some log space.
    Server: Msg 1750, Level 16, State 1, Line 1
    Could not create constraint. See previous errors.
    The statement has been terminated.

    Our recovery model is FULL, and we back up the transaction log every 30 minutes. However, this single transaction fills the entire log in one go. Currently, the log file is allocated at about 160 GB, so the log is already nearly half the size of the table. The index is on four fields out of 18 total.

    Questions:

    1. Is it reasonable for an index creation to generate more log than half the size of the whole table? Of course, we don't even know the upper bound; for all I know, this operation needs half a terabyte of log. How can I ballpark the log size for this?

    2. I doubt it, but is there any way to minimize the logging? I have read that the BULK_LOGGED recovery model does "minimal" logging of the CREATE INDEX operation. Does that apply to adding a primary key constraint as well?

    Thanks in advance for your help,

    Chris

  • I would perform a full backup before starting this, to ensure you have a way to restore.

    I suggest setting the database to Simple mode, setting the log to auto-grow in large increments, creating the index, issuing a checkpoint, truncating the log, shrinking the log back to 160 GB, setting the database back to Full mode, and performing a complete database backup.

    The clustered index build reorders all 937 million rows, which will create a huge log: at least the size of the table plus the size of the index.

    A checkpoint forces log truncation in Simple mode; the log will still be large because this all runs as a single batch. Truncating the log frees any remaining allocated space and ensures that the shrink operation can return the log file to its original size.

    Then shrink the log file back to your normal 160 GB before changing back to Full mode. Switching back to Full requires a full backup to re-establish the log backup chain, and that backup is a good idea anyway.
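
    Roughly, that sequence in T-SQL would look something like the sketch below. Treat it as an outline only: the logical log file name (DTAX_Log), the growth increment, the shrink target, and the backup path are assumptions you would replace with your own values.

    USE DTAX
    GO

    -- Switch to Simple recovery so the log can be truncated at each checkpoint.
    ALTER DATABASE DTAX SET RECOVERY SIMPLE;

    -- Let the log auto-grow in large chunks (the logical file name DTAX_Log is assumed).
    ALTER DATABASE DTAX MODIFY FILE (NAME = DTAX_Log, FILEGROWTH = 10240MB);

    -- Build the clustered primary key.
    ALTER TABLE FINANCIAL_LINE_ITEM ADD CONSTRAINT PK_FINANCIAL_LINE_ITEM
    PRIMARY KEY CLUSTERED (ENTITY_ID, TRANSACTION_ID, LINE_ITEM_NUM, FORM_SEQ_NUM)
    WITH FILLFACTOR = 80 ON [FIN_IDX];

    -- Force log truncation, then shrink the log back to its normal size.
    CHECKPOINT;
    BACKUP LOG DTAX WITH TRUNCATE_ONLY;
    DBCC SHRINKFILE (DTAX_Log, 163840);   -- target size in MB (about 160 GB)

    -- Back to Full recovery; the full backup restarts the log backup chain.
    ALTER DATABASE DTAX SET RECOVERY FULL;
    BACKUP DATABASE DTAX TO DISK = 'E:\Backups\DTAX_full.bak';   -- path is hypothetical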

    I question the 80 fill factor, but then I suspect that you know the table growth and key dispersion well enough to choose this value.

    Andy

  • It's been a while since I built or rebuilt a big index, but the general rule of thumb is that it will take double the size of the table to build a clustered index on it. There is an option to sort in tempdb, but that will take the same amount of space there as well; the plus side is that you can clean up tempdb rather easily. The syntax is below, with a sketch of the sort-in-tempdb approach after it.

    CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name
        ON { table | view } ( column [ ASC | DESC ] [ ,...n ] )
    [ WITH < index_option > [ ,...n ] ]
    [ ON filegroup ]

    < index_option > ::=
        { PAD_INDEX |
          FILLFACTOR = fillfactor |
          IGNORE_DUP_KEY |
          DROP_EXISTING |
          STATISTICS_NORECOMPUTE |
          SORT_IN_TEMPDB }
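
    If memory serves, ALTER TABLE ... ADD CONSTRAINT in 2000 only accepts WITH FILLFACTOR, so to get SORT_IN_TEMPDB you would build a unique clustered index directly instead of the primary key constraint. A sketch (the index name is just an example):

    -- Sketch: unique clustered index on the same four columns, sorting in tempdb.
    CREATE UNIQUE CLUSTERED INDEX IX_FINANCIAL_LINE_ITEM
        ON FINANCIAL_LINE_ITEM (ENTITY_ID, TRANSACTION_ID, LINE_ITEM_NUM, FORM_SEQ_NUM)
        WITH FILLFACTOR = 80, SORT_IN_TEMPDB
        ON [FIN_IDX];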

    Tom

  • I found this in BOL.

    Because the leaf level of a clustered index and its data pages are the same by definition, creating a clustered index and using the ON filegroup clause effectively moves a table from the file on which the table was created to the new filegroup. Before creating tables or indexes on specific filegroups, verify which filegroups are available and that they have enough empty space for the index. It is important that the filegroup have at least 1.2 times the space required for the entire table.
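
    To check that before kicking it off, something along these lines (a sketch; run it in the DTAX database) reports the free space per file:

    -- Size, used, and free space per file in the current database, in MB
    -- (size and SpaceUsed are in 8 KB pages, so divide by 128).
    SELECT  RTRIM(name)                                           AS logical_name,
            size / 128                                            AS size_mb,
            FILEPROPERTY(RTRIM(name), 'SpaceUsed') / 128          AS used_mb,
            (size - FILEPROPERTY(RTRIM(name), 'SpaceUsed')) / 128 AS free_mb
    FROM    sysfiles;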

    Tom

  • Simple mode won't help unless you are processing other work. I believe this runs as one transaction, which means it wouldn't get logged until it was completed.

  • Check that: it won't get committed until it completes. So the entire thing will be written to the transaction log, even in Simple mode, and won't be cleared until it completes.
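
    For what it's worth, you can watch that happening while the statement runs, e.g.:

    -- Oldest active transaction in DTAX; the index build shows up here until it commits.
    DBCC OPENTRAN ('DTAX');

    -- Log size and percent used for every database, including DTAX.
    DBCC SQLPERF (LOGSPACE);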

  • Your existing table is 334 GB. Using the 'times 2' rule you are going to need 668 GB of database space to rebuild the index and at least that much in transaction log space. Since you are on a SAN, if space is available, then you're OK. If you decide to perform this by just adding space, I'd go to 'simple' recovery and make your transaction log allocation 700 GB.

    However, if the database (or just your table) can be taken offline without ill effects, I'd suggest a rebuild of the table. First bcp out the data in 100-million-row files (you can probably do this concurrently to save time). Drop and recreate the table. Then bcp in the data with a batch size of 1,000,000. This will give you smaller units of work (transactions) and hopefully keep your transaction log manageable. In the past I've found that this method is quicker, although it has more steps and intervention.
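
    A rough sketch of the bcp steps follows; the server name, file paths, and the ENTITY_ID range used to split the export are all made-up placeholders, and you could run the bcp commands from a command prompt instead of through xp_cmdshell.

    -- Export one slice in native format (repeat with different ranges / file names).
    EXEC master..xp_cmdshell
        'bcp "SELECT * FROM DTAX.dbo.FINANCIAL_LINE_ITEM WHERE ENTITY_ID <= 1000000" queryout D:\unload\fli_01.dat -n -T -S MYSERVER';

    -- ... drop and recreate FINANCIAL_LINE_ITEM here ...

    -- Reload with a 1,000,000-row batch size so each batch commits on its own
    -- and the log space can be reused between batches.
    EXEC master..xp_cmdshell
        'bcp DTAX.dbo.FINANCIAL_LINE_ITEM in D:\unload\fli_01.dat -n -T -S MYSERVER -b 1000000';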

    Regards,
    Rudy Komacsar
    Senior Database Administrator
    "Ave Caesar! - Morituri te salutamus."

