Archive Every Day

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 720952

    Comments posted to this topic are about the item Archive Every Day

  • akljfhnlaflkj

    SSC Guru

    Points: 76202

    We haven't solved this problem so we don't archive. We just let the file data grow. Fortunately we have a relatively small set of data, but it is growing. We do need to address it eventually.

  • Eric M Russell

    SSC Guru

    Points: 125100

    This reminds me of a discussion from a few months back:

    http://www.sqlservercentral.com/Forums/FindPost1673628.aspx

    One approach is sliding window table partitioning. The rows can also age off into a ReadOnly partition. All rows can be queried seamlessly, and it takes advantage of partition elimination.

    Another simple approach I've used is having a scheduled process that periodically deletes rows from the primary that have aged past a certain date/time, and then leverage the OUTPUT clause to insert those same rows (or just a subset of columns) into an archive table. The entire operation is completed with a single statement and transaction.

    For example:

    delete from PrintJobs

    output deleted.JobID, deleted.PrintDate

    into PrintJobsHistory ( JobID, PrintDate )

    where datediff( day, PrintDate, getdate() ) > 5;

    If full query access to history is needed, a paritioned view can unionize archived rows with the primary table.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 720952

    I like the use of the OUTPUT clause there. Much simpler than trying to build a transaction. It would also work with larger volumes if you changed the where clause in there.

  • Eric M Russell

    SSC Guru

    Points: 125100

    Steve Jones - SSC Editor (6/9/2015)


    I like the use of the OUTPUT clause there. Much simpler than trying to build a transaction. It would also work with larger volumes if you changed the where clause in there.

    That approach is very simple, but physically deleting rows leads to table and index fragmentation. Sliding window paritioning may prove to do a better job of maintaining internal organization of the table since it performs partition switching (a meta data update) rather than physically deleting rows.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Lynn Pettis

    SSC Guru

    Points: 442359

    Would be nice if Microsoft brought partitioning to all editions of SQL Server, well, maybe not Express but it wouldn't hurt for people experimenting.

  • Eric M Russell

    SSC Guru

    Points: 125100

    Lynn Pettis (6/9/2015)


    Would be nice if Microsoft brought partitioning to all editions of SQL Server, well, maybe not Express but it wouldn't hurt for people experimenting.

    I use a local install of Developer Edition for experimenting.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Jeff Moden

    SSC Guru

    Points: 997128

    Lynn Pettis (6/9/2015)


    Would be nice if Microsoft brought partitioning to all editions of SQL Server, well, maybe not Express but it wouldn't hurt for people experimenting.

    It's a little different and a little more detailed but partitioned views work in both Standard and Enterprise editions and a lot of similar partitioning "tricks" can be had including "switching" partitions in and out simply by rebuilding the view.

    In some aspects, partitioned views can be "easier" than partitioned tables and have some advantages of partitioned tables. Of course they also have some disadvantages but those all have some pretty easy work arounds.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden

    SSC Guru

    Points: 997128

    From the article:


    Maybe we should have partitioning implemented early on, though the edition limits on that technology are problematic.

    Somehow, it seems, that partitioned views have fallen out of vogue even in the Standard Edition. From personal experience, I can tell you that they're not difficult to master ( although if you think "Well, that was easy", then you've probably done it wrong), have many of the advantages of partitioned tables (including fast "switches" in and out if you think about it a little bit), have some advantages that partitioned tables don't, and work one hell of a treat on larger tables or tables that will get large when it comes to backup and restores if you do it right. And, yes, they can be fully automated.

    Hmmm... yet another article that I need to add to my dance card.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • David.Poole

    SSC Guru

    Points: 75394

    I've got a slightly bigger problem. How to archive a Teradata appliance.

    The solution I've gone with is Hadoop. It ain't cheap but it is cheaper than having to expand the Teradata box. It is actually buying us time to consider how to get the best out of Teradata.

    Much of the time an archive is there "just in case". Quite often when you dig into the business requirements for an archive solution it is remarkably sparse. There are few, if any performance NFRs or RTO, RPO criteria. The stated requirements could be summed up as store data securely for 'n' years and be able to bring it back in the event of a compliance query.

    With that level of requirements where is the onus on me to store the data in expensive high performance equipment?

    Then throw into the mix that the few queries that are turn against archive data are the sort where a slow running batch process is perfectly acceptable. MapReduce (or more accurately HIVE SQL query) = slow running batch process.

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply