Create a non clustered index on a 400million rows

  • Does anybody have an idea about how long it takes to create a non clustered index on a large table. The table has about400M rows. It has about 17 columns and it has a clustered index on the identity field and non clustered index on a date field. I am not sure if it makes sense to create a temporary table and just DTS everything back in after creating the index. Let me know your thoughts.

  • Here's one, put the clustered index on the DATE. The CI on the identity column is pretty muc a waste in this case.

    You might also confirm that adding that CI increases the performance all-around before taking this step.

    As for creation time, a couple hours at least would be my guess (assuming the server isn't doing anything else... and plenty of ram and disk space and spindle available).

  • I wouldn't suggest you empty the table, create the index and then load the table. That's actually the opposite of how most large scale loads are done. Instead, get every bit of data into the table and then create the index on the data.

    You're moving up to the realm of data volumes where you might want to consider partitioning in order to get more disks involved in the process.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • The time will vary hugely depending on your configuration and the average data length of your rows, but I would figure on a couple of hours at a minimum. And probably a couple of days at the maximum.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • 400 Million rows? Consider partitioning that bad boy.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply