indexes on staging tables

  • Hi All,

    Scenario 1:

    Staging tables are truncated daily and refilled with data. These tables are then used to update the live tables. The indexes on the staging tables were created once, up front, and stay in place.

    Scenario 2:

    Before inserting data into the staging tables, the indexes are dropped.

    After the data has been inserted, the indexes are re-created on the staging tables to support the updates and inserts into the live tables.

    P.S. indexes are all non-clustered.
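
    For clarity, the Scenario 2 pattern amounts to something like the sketch below (the table and index names are hypothetical placeholders, not our real objects):

    -- Sketch of the Scenario 2 pattern (hypothetical names)
    DROP INDEX IX_Staging_Col1 ON dbo.StagingTable;    -- drop the non-clustered index(es)

    TRUNCATE TABLE dbo.StagingTable;

    -- ... bulk load the staging table here ...

    CREATE NONCLUSTERED INDEX IX_Staging_Col1          -- re-create them for the updates
        ON dbo.StagingTable (col1);                    -- and inserts into the live tables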

    Will there be any performance gain in Scenario 2?

    Thanks.

  • I think that it depends on a number of different things, starting with exactly how the tables are re-filled. In any event, it's always best to test it to be sure.


    I assume that after truncating the staging tables you are refilling them in one shot. In that case there will be a performance gain in Scenario 2.

    Two reasons:

    1. Indexes need not be updated during inserts

    2. Fragmentation will be minimal, as the indexes are created afresh
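
    A quick way to verify the second point after either approach is sys.dm_db_index_physical_stats (the staging table name below is a hypothetical placeholder):

    -- Compare fragmentation of the staging table's indexes after each scenario
    SELECT  i.name                           AS index_name,
            ips.index_type_desc,
            ips.avg_fragmentation_in_percent,
            ips.page_count
    FROM    sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.StagingTable'), NULL, NULL, 'LIMITED') AS ips
    JOIN    sys.indexes AS i
            ON  i.object_id = ips.object_id
            AND i.index_id  = ips.index_id;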

  • Fazalul Haque (2/17/2010)


    I assume that after truncating the staging tables you are refilling them in one shot. In that case there will be a performance gain in Scenario 2.

    Two reasons:

    1. Indexes need not be updated during inserts

    2. Fragmentation will be minimal, as the indexes are created afresh

    Actually, I would suspect it to be the other way around, and for just these reasons. Consider that "loading in one shot" often means BULK INSERT or BCP, which I believe are optimized for this.
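
    For reference, a bulk load into a staging heap that qualifies for minimal logging typically looks something like the sketch below (the file path, terminators, and table name are hypothetical):

    -- Hypothetical bulk load of the staging table; TABLOCK is one of the
    -- requirements for minimal logging on a heap (simple or bulk-logged recovery)
    BULK INSERT dbo.StagingTable
    FROM 'C:\feeds\daily_extract.txt'
    WITH (TABLOCK,
          FIELDTERMINATOR = ',',
          ROWTERMINATOR   = '\n',
          BATCHSIZE       = 100000);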


  • thanks...

    I just checked it out... without indexes, the update query takes 30 sec.

    With the indexes in place, it takes 2 min!

    RBarryYoung (2/17/2010)


    I think that it depends on a number of different things, starting with exactly how the tables are re-filled. In any event, it's always best to test it to be sure.

  • ziangij (2/19/2010)


    thanks...

    I just checked it out... without indexes, the update query takes 30 sec.

    With the indexes in place, it takes 2 min!


    Yes, but did you count the time to drop and re-create the indexes too?

    Also, just for our information, what are you using to load your table(s): INSERT, BULK INSERT, SSIS, DTS, program with SqlBulkCopy, or BCP?
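
    A rough way to make that comparison fair is to time the whole cycle rather than just the update; a sketch, with hypothetical index and table names:

    -- Time the complete cycle: drop indexes, load, re-create indexes, update live
    DECLARE @t0 DATETIME;
    SET @t0 = GETDATE();

    DROP INDEX IX_Staging_Col1 ON dbo.StagingTable;
    -- ... load the staging table here (DTS/SSIS/BULK INSERT/etc.) ...
    CREATE NONCLUSTERED INDEX IX_Staging_Col1 ON dbo.StagingTable (col1);
    -- ... run the updates/inserts into the live tables here ...

    SELECT DATEDIFF(SECOND, @t0, GETDATE()) AS elapsed_seconds;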


  • ziangij (2/19/2010)


    thanks...

    I just checked it out... without indexes, the update query takes 30 sec.

    With the indexes in place, it takes 2 min!


    Drop the indexes, then load the data, then create the indexes after the data load; that way fragmentation of the table will not take place.

    EnjoY!

  • GT-897544 (2/20/2010)



    Drop the indexes, then load the data, then create the indexes after the data load; that way fragmentation of the table will not take place.

    EnjoY!

    Follow Barry's advice: test. Not just the staging-table load; test the extract from the staging tables too.
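
    A simple way to instrument the extract side is to wrap the statement that moves data from staging to live in SET STATISTICS TIME/IO (the table and column names below are hypothetical):

    SET STATISTICS TIME ON;
    SET STATISTICS IO ON;

    -- Hypothetical staging-to-live update
    UPDATE live
    SET    live.usedvalue = s.usedvalue
    FROM   dbo.LiveTable    AS live
    JOIN   dbo.StagingTable AS s
           ON s.col1 = live.col1;

    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;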



  • Agree with ChrisM. Barry has good advice, but be sure you test extraction as well.

  • GT-897544 (2/20/2010)



    Drop the indexes, then load the data, then create the indexes after the data load; that way fragmentation of the table will not take place.

    EnjoY!

    Fragmentation of the table will be avoided only if you happen to drop the clustered index as well (which is often not recommended unless you are inserting a large percentage of the total table). Dropping and re-creating the clustered index is an entirely different exercise that usually entails rebuilding the entire table, unless you happen to have loaded the data in clustered index order (and if you did, why bother dropping the CI?). You would really want to test that one before setting something like this up to happen on a regular basis.

    Dropping the non-clustered indexes and recreating them after the load will make sure that the indexes aren't fragmented.
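
    If scripting out all the DROP/CREATE statements gets unwieldy, disabling the non-clustered indexes before the load and rebuilding them afterwards is a common variation on the same idea (the index and table names are hypothetical):

    -- Disable the non-clustered index(es) before the load...
    ALTER INDEX IX_Staging_Col1 ON dbo.StagingTable DISABLE;

    -- ... load the staging table here ...

    -- ...then rebuild them afterwards (a rebuild also defragments the index)
    ALTER INDEX IX_Staging_Col1 ON dbo.StagingTable REBUILD;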


    Apologies... I should have included the table structure and some sample data...

    declare @wareshousetable table
    (
        col1 [varchar](64) NULL,
        datesection [char](16) NULL,
        totalvalue [numeric](10, 0) NULL,
        usedvalue [numeric](10, 0) NULL,
        smalldatesection char(7) NULL
    )

    insert into @wareshousetable (col1, datesection, totalvalue, usedvalue)
    select 'server1','220003722000',100,25 union all
    select 'server1','220003722000',100,25 union all
    select 'server1','220003722003',100,35 union all
    select 'server1','220003722004',100,55 union all
    select 'server1','220003722006',100,75 union all
    select 'server1','220003722000',100,45 union all
    select 'server1','220003722020',100,25

    select * from @wareshousetable

    I can't have a clustered index on the table as it has many duplicate records...

    1. Total count per day: 5 million rows or more.

    2. DTS is used to insert data into the staging tables.

    3. Using a stored proc, the data is cleansed and fed into the live tables.

    4. When I say the update takes time, I mean updating the smalldatesection column (without indexes it takes 30 seconds; with indexes it takes 2 min). I can't use datesection as it is, so I need to modify it and then use the data (a sketch of that kind of update is below).
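
    The update being timed has roughly the shape sketched below, run against the sample table above; the actual transformation of datesection is not specified, so LEFT(datesection, 7) is only a placeholder:

    -- Placeholder for the real datesection-to-smalldatesection transformation
    update @wareshousetable
    set smalldatesection = left(datesection, 7)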

  • Good to know, thanks for the follow-up!

