Keeping at table lean and highly available

  • Hi folks,

    I have a question in regards to a design I've been pondering. Perhaps i'm merely over analyzing the problem but I wouldn't mind some input.

    I have a raw table that is used to collect large amounts of statistical data. Right now a web application hits this table directly and of course it is inefficiant and slow. I'm trying to redesign to process by having a job that parses through and summarizes the data and outputs it to a staging table before then adding it to a small table that will be exposed to the web application. I have this working great and seeing huge performance gains in a test environment.

    My concern (and it might not be warranted) is with the updating of the final table. It needs to be refreshed every hour with the new statistics. Keep in mind as well that the web application will sometimes query the table thousands of times in a span of a few minutes. My thoughts was to have a bit field let's say "isNew" in the destination table that is defaulted to true as new data comes in. (The calling stored procedure always grabbing data from rows where "isNew" is false.) Then I delete "old" rows then change all rows makred as new to old. Making them available once again and ready for the next data load.

    The resulting table should never grow beyond 10,000 rows thus the point of trying to keep it small, accessible and fast. Does anybody see this solution causing issues with a query returning no data in that small timeframe when I do the DELETE/ALTER or causing any kind of significant delay or wait to the application. Is there a beter way? Sorry about the verbage folks and any comments are appreciated.

    Thanks,


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • Assuming that you are doing this update in a stored procedure, I would do something more along the lines of this psudo code:

    ... prep work

    set transaction isolation level serializable;

    begin transaction

    begin try

    truncate destination_table;

    insert into destination_table;

    commit transaction

    end try

    begin catch

    rollback transaction

    ... other error code as needed

    end catch

    end -- end of update procedure.

  • This actually sounds like a case for partitioning. Partition the main table, make sure the staging table is identical (indexes, constraints, columns, etc) then to move the data from staging into main you could just add a new partition to the main table and switch that new partiton with the staging table. Metadata operation, near instant.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/21/2012)


    This actually sounds like a case for partitioning. Partition the main table, make sure the staging table is identical (indexes, constraints, columns, etc) then to move the data from staging into main you could just add a new partition to the main table and switch that new partiton with the staging table. Metadata operation, near instant.

    This works really fast. Only thing is, you need to be using the Enterprise Edition of SQL Server. What edition are you using?

  • Yes, I am using Enterprise edition. Partitioning is not something I have yet experimented with but that sounds like a great solution. Thanks for the idea.


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • Since you are using Enterprise Editon, I'd go with Gail's suggestion.

  • GilaMonster (6/21/2012)


    This actually sounds like a case for partitioning. Partition the main table, make sure the staging table is identical (indexes, constraints, columns, etc) then to move the data from staging into main you could just add a new partition to the main table and switch that new partiton with the staging table. Metadata operation, near instant.

    Just a quick question regarding this method. This loads the 'new' data into the main table quickly but you still end up with the old data in that table. As such could result in unwated data...or perhaps I read it incorrectly.


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • yb751 (6/21/2012)


    GilaMonster (6/21/2012)


    This actually sounds like a case for partitioning. Partition the main table, make sure the staging table is identical (indexes, constraints, columns, etc) then to move the data from staging into main you could just add a new partition to the main table and switch that new partiton with the staging table. Metadata operation, near instant.

    Just a quick question regarding this method. This loads the 'new' data into the main table quickly but you still end up with the old data in that table. As such could result in unwated data...or perhaps I read it incorrectly.

    You would also swap out the old data. This means each set of data will have to be in its own partition.

  • Hmmmm...just quickly reading over msdn articles on switching it seems you can only switch a partition with an empty one. So then I would see it as a three partition cycle.

    1. Staging -> Holding

    2. Main -> Staging

    3. Holding -> Main

    4. Delete Staging

    or perhaps simply

    1. Main -> Holding

    2. Staging -> Main

    3. Delete Holding

    Thanks folks, I'll have to do some more reading and testing but it sounds promising.


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • Typically it's done like this:

    Load new data into staging

    Add a new partition to main

    Switch new partition in main and staging (now staging is empty)

    Switch partition containing old data with staging

    Merge empty partition in main table

    Truncate staging.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Even better...that's fantastic Gila.

    Thank You


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • Sorry one more question if I may.

    What would be the best stategy for creating a new partition on the main table in such a way that I would not have to recreate the partition function and scheme every time I run the stored procedure?


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • Been a while since I even created a partitioned table, but bassed on the information provided so far, I have absolutely no idea, not even a shot in the dark.

    It would help if you would provide the current DDL (CREATE TABLE statement) for the table, including indexes.

  • yb751 (6/21/2012)


    Sorry one more question if I may.

    What would be the best stategy for creating a new partition on the main table in such a way that I would not have to recreate the partition function and scheme every time I run the stored procedure?

    You can't, not easily. You could 'rotate' the partitions by using a contrived control value for them, but I don't recommend it as it defeats some of the optimization techniques used under the hood to help partitioning be seamless to the end user's speed.

    You nearly always end up programatically rebuilding the partition function and scheme.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Evil Kraig F (6/21/2012)


    yb751 (6/21/2012)


    Sorry one more question if I may.

    What would be the best stategy for creating a new partition on the main table in such a way that I would not have to recreate the partition function and scheme every time I run the stored procedure?

    You can't, not easily. You could 'rotate' the partitions by using a contrived control value for them, but I don't recommend it as it defeats some of the optimization techniques used under the hood to help partitioning be seamless to the end user's speed.

    You nearly always end up programatically rebuilding the partition function and scheme.

    But we can give you a better answer, if you provide us with more details.

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply