Index creation - Before OR AFTER Data?

  • Hello,

    I have an monthly update mechanism that builds a database structure from scratch and populates it from five restored databases i.e. all DDL is created via multiple scripts for schema creation etc and then pulls in the necessary data from the source databases.

    At the moment the constraints have been built as part of the DDL to ensure the data is consistent which will have created indexes in certain situations (primary key).

    I am now looking at creating non-clustered indexes on foreign keys. Should I create these indexes during the schema creation or after the populations scripts?

    Thanks

  • >>Should I create these indexes during the schema creation or after the populations scripts?

    Create indexes after pupulating the data.

    Benefits:

    1) Data population will be faster as there is no index update.

    2) New index will be free from fragmentation.

  • If you need the indexes to accomplish the data population efficiently, then create them before. Otherwise, after.

    For example, if one of the tables you need to populate depends on data in one of the other tables you are populating, there might be an index that makes that more efficient. In that case, creating that index might be a good idea.

    But for bulk-loading tables, creating the indexes afterwards is definitely faster and better.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks both for the information.

    I have decided to leave the primary constraints and unique constraints which by definition create indexes on the table creation statements so that the data is consistent. I will then insert the new data (these are scripts that insert into() select... from etc etc the five source databases) and afterwards re-build the indexes on all tables using the ALTER INDEX ALL ON statement.

    I will then apply the non-clustered indexes for the foreign key fields and run sp_updatestats so that the Query optimizer work more effectively.

    Does this seem like a good way of ensuring good performance from the indexes?

    As a note, this will purely be a READ_ONLY database and therefore I have left the fillfactor at 100%

  • Create index after populating database surely will improve the performance of populating database. Your actions are good.

  • Agree with above. Do it after.

  • Agreed as long as we're only talking about non-clustered indexes.

    If we were talking clustered indexes, it becomes a "it depends". Assuming the data files were roughly in the clustered index order it would make sense to create the clustered key FIRST, and then load. If the file order is vastly different from the clustered index, it is usually faster to add it in after the initial load.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply