24/7 Mulitple Table Design

  • Hi All

    I will apologise this will be a little vague but is all the info I can give at the moment.

    I have a few ideas however I'm not going to present those here as I want fresh ideas..

    Consider this:

    Two tables across two sites and servers

    They are a duplicate of one-another in structure

    There is an ID in each table (auto inc)

    Each table HAS to have data from it inserted in the other duplicate table WITHIN SECONDS ideally

    If one server goes down we need to be able to switch over INSTANTLY between one server to another to hit this table with updates / inserts / selects AND containing

    the data from the other table up-to the point of failure..

    If the above happens then ALL data not included in one table while it was down HAS to be inserted into the other table once it is back online

    Report off these tables (be it a separate DW or server sky is the limit here as long as the latency is minimum (seconds) ) 24/7

    Require a method of distinguishing where the data has come from table A or B

    There are many columns in these tables all VARCHAR(whatever) and DATETIME

    Need to be able to allow one client to hit a primary table for INSERTS / UPDATES and SELECTS (be it A or B as long as data is replicated to the other table) and be able to switch these AT ANY TIME

    Need a PK in these tables that is uniform if possible i.e table A has PK ID of 1 and 2 and 5 then table B will not have either and start at 3 - 4 and restart at 6

    As I said vague yes but at the moment it is the best I can do..

    The end game a 24/7 solution for inserts ect plus reporting and each table requires to hold data from the other 24/7

    Ideas anyone!!!!

    Cheers

    Andy

    ==========================================================================================================================
    A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe

  • Some rough ideas:

    1) if the requirement that each table contains everything inserted in the other table up to the point where the server supporting that other table crashes, you need to update the remote server in the same transaction as the local one; that is potentially poisonous, because it means that the transaction (which will hold at least one page lock as it's updating a page) will hang up until it knows the other server is dead, which means other rows can't be inserted until the remote access finally fails and maybe that delay is unacceptable, and you may have to commit the transaction even though the remote part has failed if insertion order is important(ie if that row must not get an identifier that implies it was later than any other rows that were already queued on the page lock at the time the remote failure was discovered). Also, transactions originating in the other machine will hold page locks on this machine, which may cause similar problems when the remote machine fails - so you want the remote machine to be inserting on a page that the local machine won't use, and so that that won't cause performance issues (too much head movement) the table wants to be partitioned so that the two machines hit different partitions which are held on different drives.

    2) If the requirement mentioned at 1 above is slacker - so that some rows that the remote machine created before a crash may not get to the local machine until the remote machine has recovered as long as any row created more than a couple of seconds ago is on the local machine - distributed transactions are not needed, which makes everything much easier. Each machine can have a process that polls the remote machine for new rows (recognised by identity value) in one transaction and then inserts them in the local table (using identity insert on) in a separate transaction. In this case you may be able to afford for the local rows and the remote rows to hit the same page, but it's probably still safer to try to separate them.

    3) identitity values can be used to indicate which machine a row was created on, and also if desired to partition the table. Using identity(0,-1) on one machine and identity(1,1) on the other is the obvous way to do this. If on the other hand you don't want partitioning because it's an unneccessary complication but still want to avoid excessive head movement, or need to partition on something completely different, you could use identitity(0,2) on one machine and identity(1,2) on the other and cluster on identity.

    4) Maybe each machine could have two tables: stuff it created and stuff the other machine created, for the purpose of inserting stuff; while everything else (including all reporting) sees a view which is the union of the two tables. You could position these two tables on different filegroups. That's simpler than partitioning.

    Tom

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply