Indexing large tables

  • I am merging several large tables (10M + rows) in MS Access into one large table in SQL Server 2000. The tables have no PK's or indexes. What is best way to implement indexes, PK (for improvement of query performance) on the new table of 100M rows?

    Thanks

  • The PK is the unique identifier for a row. Your data will tell you this one. It can be clustered or non-clustered, depending on how you query, though I lean towards NC.

    Other indexes are based on how you query the table. What columns you return, what's in your WHERE clauses, you have to look at what you often filter by in the WHERE clause to determine what to index.

  • Yes very good thanks for your reply.

    The problem is this: I have a table which is populated with 100 million rows. There is no indexes or PK on it. To add a PK or index now would take who knows how long to build due to the size of the table and might time out. Should I go ahead and do that anyway or is there a better solution?

    My other idea was to build a new table, put PK and indexes in place and then populate it.

    BTW the data is coming over from Access via Import/Export wizard on SSMS.

    Much appreciated.

  • If the table is that large, consider splitting it into smaller tables based on a PK and then build a partitioned view to put it all back together again. That way, you can get the whole thing ready behind the scenes and when you're done, just name the view the same as the original table. Obviously, you'd need to rename the original table as "_Old" or some such and you'd need to do a final update from the original table to make sure you had all of the data.

    There are articles in Books Online for how to setup such partitioning but it boils down to everything in a given "sub-table" meeting some common specific criteria (usually NOT the primary key but can be a range of PK's in a pinch) that is enforced by a constraint so the partitioned view quickly knows which table to not only read from, but which table to write to.

    If carefully planned, it can be made such that "old" data never needs to be reindexed thus lightening the over maintenance requirements.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply