staging table and indexes

  • Hi there,

    I'd like to ask for some advice.

    I have a Main table that is used for reporting. Every 30 minutes a staging table is updated with the latest data and the data is modified at this stage via an SP. This allows my main table to be fully available during this time

    The data is now ready to be imported to the main table for updated reporting. there are about 10000 records and I know this is not alot. However, there is a UNION ALL on another table with over 200,000 records so speed is essential.

    I would like to ask what the best way is to keep the indexes working well with no fragmentation. Does it make a difference if I rebuild indexes on my stage table before the data is imported to the main table? or would I have to rebuild indexes on the main table when the data is imported? or will the main table be that effected if I don't worry about rebuilding or defraging the indexes?

    Thanks in advance for any help

  • The indexes will help on main table in reportting so rebuilding can be done after importing

    but why do you want to rebuild indexes after very import?

    And do you see any performance degrade on reporting side?

    pablavo (3/24/2010)


    I don't worry about rebuilding or defraging the indexes?

    and why you dont want to worry about rebuilding or defraging ?

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Hi Bhuvnesh, Thanks for responding

    you wrote:

    The indexes will help on main table in reportting so rebuilding can be done after importing

    but why do you want to rebuild indexes after very import?

    I should have added that the main table will be truncated before every import and the new data added to it. So you suggest that it's a good idea to rebuild after importing the data to the main table. Does your question ask me why I want to rebuild after every import? I don't know if I do, that's why I set up the thread... Since the table is truncated every 30 minutes and new data added, will the indexes have to be rebuilt or would the table be fine since it's truncated?

    You wrote:

    And do you see any performance degrade on reporting side?

    As far as the main table is concerned, that is truncated and has a data import every 30 minutes, I'm unsure if there will be performance degradation. the other table (that UNIONS the Main table) that is updated nightly will have it's indexes rebuilt and that only happens out-of-hours. it's important that the main table is available as much as possible during the day but also effecient.

    Do you think I will not need to rebuild after every import or at all?

    pablavo (3/24/2010)

    --------------------------------------------------------------------------------

    I don't worry about rebuilding or defraging the indexes?

    and why you dont want to worry about rebuilding or defraging ?

  • pablavo (3/24/2010)


    I should have added that the main table will be truncated before every import and the new data added to it.

    NOw that makes sense that rebuilding indexes with every import.

    So i think you should build indexes with fillfactor = 100 and then import the data. and why Fillfactor = 100 coz it will make sql server to traverse less data pages.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • pablavo (3/24/2010)


    I should have added that the main table will be truncated before every import and the new data added to it. So you suggest that it's a good idea to rebuild after importing the data to the main table. Does your question ask me why I want to rebuild after every import? I don't know if I do, that's why I set up the thread... Since the table is truncated every 30 minutes and new data added, will the indexes have to be rebuilt or would the table be fine since it's truncated?

    If you're talking about a non-clustered index, I think I would drop the index after truncating the table, then import the new data, then re-create the index. It would speed up the import (possibly considerably, depending on the index), and the index would be rebuilt as efficiently as SQL was able to.

    If it is a clustered index, keep the index in place during import, but see if you can order the incoming data in the same order as the clustered index would order it, so you'll be filling pages in sequence.

    Rob Schripsema
    Propack, Inc.

  • Interesting

    Rob Schripsema (3/25/2010)


    If you're talking about a non-clustered index, I think I would drop the index after truncating the table, then import the new data, then re-create the index.

    Can you explain, why Non-clustered index after import ?

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • It's frequently quicker to create a non-clustered index with the data already in the table than to do a bulk insert and reindex.

  • Bhuvnesh (3/26/2010)


    Interesting

    Can you explain, why Non-clustered index after import ?

    As was just said, it's frequently quicker to let SQL index the entire table once the data is in it, building just the index pages, than to make it write a record to a data page and then turn around and have to find and update an index page as part of the insert.

    With a clustered index, on the other hand, the data is being written in the same set of pages as the index, so it would be more efficient (and this is an educated guess, I don't have benchmarks to verify this) to have the clustered index in place when inserting data. And if the data was being inserted in the same order it will be physically stored in the clustered index, so much the better.

    Maybe others can verify this, or correct my thinking....

    Rob Schripsema
    Propack, Inc.

  • I think that will depend largely on how closely the order of the inserted data matches the order of the clustered index. If it matches then SQL will be writing sequentially so there's no concern about there not being space where it wants to put the new row. If it doesn't match then there's the possibility that there won't be space where it needs to put it and fragmenting will happen. One thing that I don't have time to right now but would be interesting to is seeing as this is all happening in one insert whether or not SQL will order the rows in the order it needs to write them if no sort order is applied.

  • cfradenburg (3/26/2010)


    . . . One thing that I don't have time to right now but would be interesting to is seeing as this is all happening in one insert whether or not SQL will order the rows in the order it needs to write them if no sort order is applied.

    I believe that what will happen is that the data being imported will tend to be in the "order of entry" rather than SQL Server doing something smart and figuring out what ORDER BY clause should have been provided.

    In any case, if you supply the ORDER BY clause and it is not needed, then the worst it should do is occupy your fingers a bit longer as you type; however, if it is needed, it could significantly improve performance during and after the import.

    Ralph D. Wilson II
    Development DBA

    "Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
    A. Lincoln

  • First, thanks folks for the great responses.

    Rob Schripsema wrote

    If you're talking about a non-clustered index, I think I would drop the index after truncating the table, then import the new data, then re-create the index. It would speed up the import

    I've heard this before so as far as a faster import. So just to clarify, dropping and recreating the index is a quicker process than reindexing the index that's already there?

    I'm presuming by the suggestions I've been given that my table will be fragmented if I'm truncating and importing new data without creating (or reindexing) the table after each import?

    Thanks again

  • repeat post mistake

  • I've heard this before so as far as a faster import. So just to clarify, dropping and recreating the index is a quicker process than reindexing the index that's already there?

    I'm presuming by the suggestions I've been given that my table will be fragmented if I'm truncating and importing new data without creating (or reindexing) the table after each import?

    If the index is already there, then each time a record is added to the table, SQL Server will also add the appropriate entry to the index. It won't know the quantity or nature of the data as it comes in, so there will be a lot of page splitting and adjustments to be done as this is happening.

    If you can drop the index before the import, SQL Server doesn't have to "worry" about updating an index as it writes the data records. Then, when you recreate the index once the data is in the table, SQL Server can sort the data (in RAM) according to the columns specified in your index prior to writing out the index pages, and then write them to disk more efficiently.

    I'm not certain that's exactly how SQL Server does it, but it makes sense that it would, and my experience has been that you do get better results (faster imports, less fragmented tables/indexes) if you do it this way.

    Rob Schripsema
    Propack, Inc.

  • Just to add to what Rob wrote, SQL updates ALL indexes on the table when a record is modified. For NCIs (nonclustered) , that can mean adding in new rows for inserts. For updates, it may or may not be a change.

    For clustered indexes, an update can result in a page split if the update doesn't fit on the page, or if a new/changed PK is there, then this calls for new space allocations and writes. If you insert in the middle of the index, possible movement of multiple rows.

    That means that dropping indexes and then doing inserts, and rebuilding the index, can be faster. I don't have a ton of experience with this, but I've seen it go both ways. As I've seen written

    "your mileage may vary. Test on your system"

  • Rob Schripsema (3/25/2010)


    pablavo (3/24/2010)


    I should have added that the main table will be truncated before every import and the new data added to it. So you suggest that it's a good idea to rebuild after importing the data to the main table. Does your question ask me why I want to rebuild after every import? I don't know if I do, that's why I set up the thread... Since the table is truncated every 30 minutes and new data added, will the indexes have to be rebuilt or would the table be fine since it's truncated?

    If you're talking about a non-clustered index, I think I would drop the index after truncating the table, then import the new data, then re-create the index. It would speed up the import (possibly considerably, depending on the index), and the index would be rebuilt as efficiently as SQL was able to.

    If it is a clustered index, keep the index in place during import, but see if you can order the incoming data in the same order as the clustered index would order it, so you'll be filling pages in sequence.

    Nope... you don't really need to do that if it's a single query that does the inserts into the truncated table.

    For the maximum speed, though, I'd maintain "two" main tables. One would be "online" and have a synonym pointed to it and the other one would be dropped. When it was time to rebuild the main table and provided that none of the data would come across a linked server, I'd use SELECT/INTO to build the new table and add the necessary indexes. Then, just repoint the synonym to the new table (which only takes a couple of milliseconds, BTW) and bingo... you're new updated table is online and there isn't a user that will ever notice.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 29 total)

You must be logged in to reply to this topic. Login to reply