Archiving

  • Comments posted to this topic are about the item Archiving

  • I start many of my mornings looking forward to my daily reading of the SQLServerCentral email, and frequently use it to leap into a new topic and spend twenty or thirty minutes learning something new. Over the years, I've occasionally helped deal with archiving issues, and often the question arises: "Is the data older than 'x' still of any business use, or is the data still loved?" Very often, old data is maintained because it is forgotten and only the DBA sees the slow accumulation.

    This morning, I can't help feeling that there is a link missing from a sentence in today's editorial. Should "This article talks about the problem of data access from a storage point of view,[...]" point to something?

  • One of my primary projects is developing and maintaining a reporting datamart for the accounting department. When a report is run, a dataset containing key and fact columns is queried from the corporate ODS (operational data store) into what we call interface tables and retained there for archival purposes. Each reporting dataset is keyed on a report header table with columns that include the runtime parameters, the time the query started, when it completed, and the text of the SQL SELECT statement. When called upon, we can go back and retrieve the exact dataset for any report produced in the past, as well as answer questions about its runtime performance or changes in SQL coding over time. That's a technique that I developed early in my career and it has proven its worth on many projects. The key point here is that we're not tracking every change that occurs in the ODS (that's more or less maintained by the ODS team); the accounting datamart only archives data aggregated for the department's purpose, so it's kept relatively small and manageable.
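
    A minimal sketch of the header-plus-interface-table idea described above, using Python's sqlite3 as a stand-in for SQL Server. The table and column names (report_header, report_interface, account_key, amount) and the helper functions are illustrative assumptions, not the poster's actual schema.

```python
import sqlite3
from datetime import datetime, timezone


def create_schema(conn):
    """Create a hypothetical header table and one interface (archive) table."""
    conn.executescript(
        """
        CREATE TABLE report_header (
            run_id       INTEGER PRIMARY KEY,
            run_params   TEXT NOT NULL,   -- runtime parameters of the report
            sql_text     TEXT NOT NULL,   -- text of the SELECT that was run
            started_at   TEXT NOT NULL,
            completed_at TEXT
        );
        CREATE TABLE report_interface (
            run_id      INTEGER NOT NULL REFERENCES report_header(run_id),
            account_key TEXT NOT NULL,    -- key column
            amount      REAL NOT NULL     -- fact column
        );
        """
    )


def archive_report_run(conn, run_params, sql_text, source_rows):
    """Record one report run and keep the exact rows it was built from."""
    started = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO report_header (run_params, sql_text, started_at) "
        "VALUES (?, ?, ?)",
        (run_params, sql_text, started),
    )
    run_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO report_interface (run_id, account_key, amount) "
        "VALUES (?, ?, ?)",
        [(run_id, key, amount) for key, amount in source_rows],
    )
    conn.execute(
        "UPDATE report_header SET completed_at = ? WHERE run_id = ?",
        (datetime.now(timezone.utc).isoformat(), run_id),
    )
    conn.commit()
    return run_id


def fetch_report_run(conn, run_id):
    """Return the archived dataset for a past report run."""
    return conn.execute(
        "SELECT account_key, amount FROM report_interface "
        "WHERE run_id = ? ORDER BY account_key",
        (run_id,),
    ).fetchall()
```

    Because every archived row carries the run_id of its header, any past report can be reproduced without tracking every change in the source system.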

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I get wrapped into a tight loop on this subject. As an industry we have data retention rules and regulations, either on us directly or on the customers we serve. I'm OK with that (like there was a choice) as far as it goes. Still, a close examination of the requirements serves well. If microfilm or other offline storage is allowed -- do it. Often what is allowed is that the information must be electronically available, but it's OK to make it the subject of a targeted search. That looks like a separate database to me.

    Many of us have huge servers with tremendous storage capabilities. Folks like me are often confronted with Express or the Compact Edition. In those environments size matters. The smaller the better. We are also in a position where large parts of the database are intermediary in nature. We take a sales order in the field on a device and upload that to our server, then it gets exported to the back end system (SAP, PeopleSoft, Dynamics (Great Plains), whatever). The order gets delivered to the customer and the order confirmation gets funneled up the same way. The back end system has it, so why is it still in my database after 6 or more years?

    Then if you get bored, try this. In a design meeting when the organizer asks if there is anything not covered in the specifications, raise your hand and utter the three deadly words. DATA REMOVAL POLICY. Then duck as stuff will come flying in your direction.

    ATB
    Charles Kincaid

  • Rich Weissler (6/20/2011)


    This morning, I can't help feeling that there is a link missing from a sentence in today's editorial. Should "This article talks about the problem of data access from a storage point of view,[...]" point to something?

    My apologies. The cut/paste from OneNote last stopped bringing cross links for some reason.

    I've edited things and here is the link http://itknowledgeexchange.techtarget.com/storage-soup/dont-turn-data-archiving-into-data-hoarding/?track=NL-52&ad=835610&asrc=EM_NLN_13976274&uid=2052671

  • Charles Kincaid (6/20/2011)


    In a design meeting when the organizer asks if there is anything not covered in the specifications, raise your hand and utter the three deadly words. DATA REMOVAL POLICY. Then duck as stuff will come flying in your direction.

    I see your Data Removal Policy, and raise you a Legal Hold. Removing old data is "good"; removing old data that is subject to discovery in a multi-year lawsuit is... "not good".

    In general, I agree that archiving (not backups, but archiving) and removing data past its retention policy is a good idea, but this gets very deep into data classification, very fast; it's important, but often "too difficult" to deal with, and as such it never gets done.
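
    To make the retention-versus-legal-hold point concrete, here is a rough sketch of a purge that skips anything under an active hold. It uses Python's sqlite3 purely for illustration; the orders and legal_hold tables, their columns, and the purge_expired helper are all hypothetical, and a real policy would need proper data classification behind it.

```python
import sqlite3


def create_schema(conn):
    """Hypothetical tables: orders to be purged, and holds that block purging."""
    conn.executescript(
        """
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL,
            order_date  TEXT NOT NULL        -- ISO date, e.g. '2004-03-15'
        );
        CREATE TABLE legal_hold (
            customer_id INTEGER NOT NULL,
            released_at TEXT                 -- NULL while the hold is active
        );
        """
    )


def purge_expired(conn, cutoff_date):
    """Delete orders older than cutoff_date unless an active hold covers them.

    Returns the number of rows removed.
    """
    cur = conn.execute(
        """
        DELETE FROM orders
        WHERE order_date < ?
          AND NOT EXISTS (
              SELECT 1
              FROM legal_hold AS h
              WHERE h.customer_id = orders.customer_id
                AND h.released_at IS NULL
          )
        """,
        (cutoff_date,),
    )
    conn.commit()
    return cur.rowcount
```

    The design choice here is that the hold check lives inside the DELETE itself, so a purge job cannot remove held rows even if someone forgets to check first.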

  • Data archival is a process that should always be given as much weight in the original design of the database as other factors are, but sadly it is not. I am not sure whether this is due to indifference or complacency at design time, or just flat out ignorance. 😀

    "Technology is a weird thing. It brings you great gifts with one hand, and it stabs you in the back with the other. ...:-D"

  • I agree with points from both Nadrek and TravisDBA.

    Right on with having to hold for legal reasons. I got that. Many of my customers are food producers or food handlers. A lot of the time they deal in frozen product. Looking at the work of Clarence Birdseye it is easy to see where one could have frozen product in inventory for over a hundred years. Then if you sell that product to a retailer who keeps it in good frozen condition it might be on their shelves for another hundred years. In the case of a product recall I might be forced to keep that data the whole time. Medical and tax records have to be retained, each with its own legal requirements.

    Travis, I fear that it is fear that drives the resistance. Fear generated by the above. Developers don't want to raise costs by having a data removal policy in the specifications, because then somebody has to go research the legal requirements.

    ATB
    Charles Kincaid

  • Charles Kincaid (6/24/2011)


    Travis, I fear that it is fear that drives the resistance. Fear generated by the above. Developers don't want to raise costs by having a data removal policy in the specifications, because then somebody has to go research the legal requirements.

    Very good point, Charles, but your point is making my point on ignorance. In the immortal words of Arnold Henry Glasow, an American humorist (1905-1998), "Fear is the lengthened shadow of ignorance". 😀


  • Charles Kincaid (6/24/2011)


    Looking at the work of Clarence Birdseye it is easy to see where one could have frozen product in inventory for over a hundred years. Then if you sell that product to a retailer who keeps it in good frozen condition it might be on their shelves for another hundred years.

    Heh... my first point is... sometimes it tastes that way. 😛

    Although I certainly understand the gist of the Birdseye example, my suggestion is that it's not a practical example. If they have such old inventory, then they have a much larger problem than figuring out when to archive data and take it off-line. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.



  • Jeff Moden (6/25/2011)


    Charles Kincaid (6/24/2011)


    Looking at the work of Clarence Birdseye it is easy to see where one could have frozen product in inventory for over a hundred years. Then if you sell that product to a retailer who keeps it in good frozen condition it might be on their shelves for another hundred years.

    Heh... my first point is... sometimes it tastes that way. 😛

    Although I certainly understand the gist of the Birdseye example, my suggestion is that it's not a practical example. If they have such old inventory, then they have a much larger problem than figuring out when to archive data and take it off-line. 🙂

    Exactly, archiving data is the least of their issues. Whatever happened to using FIFO and checking shelf life? Companies generally use the oldest items in inventory first so they can continually roll the stock and prevent deterioration or obsolescence. FIFO has been widely used and accepted for obvious reasons. This is kind of fundamental, guys.

