Auto-Deleting Data

  • Comments posted to this topic are about the item Auto-Deleting Data

  • Personally, I think auto-deleting data is a very bad idea, kind of like burning the Library of Alexandria. Not all data is valuable on its own; however, in context and with lots of data points, it brings back a picture of a civilisation.
    In a thousand years, today's PII issues will be irrelevant. However, someone looking for ancestors might just get lucky and learn of his foreforeforefather if a record we consider irrelevant today still exists in some retrievable form.

    The Sumerians pretty much had the same approach to their clay tablets: good for some time, then broken up and maybe reused as old tiles for the roof.
    The only reason we know of their accounting is a few of these clay tablets, which allowed us to decipher their language.

  • Wow, Steve, you definitely pushed my buttons with this one. It is my humble opinion ( note I don't use the vernacular there ) that the evolution of mass storage has led to a drastic decline in the skillset to properly analyze, summarize, and archive data. It's easier to throw money at storage than to logically, properly and safely handle data volume.

    First of all, there needs to be logical analysis of what level of detail and age of data actually needs to be kept online and instantaneously available. Consider such things as customer purchase history for a product that is no longer available, the detail of the price paid thirty years ago, or where the product was shipped then. Doing this properly requires analytical skill.

    Secondly, proper storage of detail seems to be currently way out of control.  While the above mentioned things may be required legally and for business decisions, we don't need to be paying for mass storage online.  Several of my early positions in IT required the use of off-line data storage for bulk data that could be retrieved IF and WHEN needed.  And by off-line, I mean that literally, not just moved to another live system.  We were simply expected to design this into our systems.  And this required not only proper design of systems themselves but also the physical location and safety of the data.  I haven't seen the old magnetic tape storage in decades, but it did serve a huge purpose in one-time cost, ease of off-premises storage but with acceptable accessibility within a reasonable time-frame.  Today a removable disk drive can contain many, many times the volume at reasonable cost. 

    In the 'old days', it was often the case that local businesses would actually reciprocate and trade off-premises storage of bulk data. If summarization is adequately designed, appropriate detail may be identified and made available within a very reasonable timeframe.

    IT professionals need to get over the idea that you can just pay ongoing costs for data storage and focus on designing both logical and physical storage with appropriate detail available for appropriate timeframes. There definitely still is a place for off-line bulk data storage.

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

  • Knut Boehnert - Tuesday, May 1, 2018 2:18 AM

    Personally, I think auto-deleting data is a very bad idea, kind of like burning the Library of Alexandria. Not all data is valuable on its own; however, in context and with lots of data points, it brings back a picture of a civilisation.
    In a thousand years, today's PII issues will be irrelevant. However, someone looking for ancestors might just get lucky and learn of his foreforeforefather if a record we consider irrelevant today still exists in some retrievable form.

    The Sumerians pretty much had the same approach to their clay tablets: good for some time, then broken up and maybe reused as old tiles for the roof.
    The only reason we know of their accounting is a few of these clay tablets, which allowed us to decipher their language.

    However, in a few years, the issues of PII are drastically relevant. I shudder to think how many children will have issues in adulthood because their data is out there now. I would argue plenty of information about culture and about ourselves is already being saved by individuals. Not sure organizations or corporations need to be able to do so.

  • skeleton567 - Tuesday, May 1, 2018 7:02 AM

    Wow, Steve, you definitely pushed my buttons with this one. It is my humble opinion ( note I don't use the vernacular there ) that the evolution of mass storage has led to a drastic decline in the skillset to properly analyze, summarize, and archive data. It's easier to throw money at storage than to logically, properly and safely handle data volume.

    ...

    I do agree with some offline storage, but likely some of that still should go. The price paid or shipping location from a decade ago (or any PII data)? Not sure there's any legal reason to keep this, and I'd argue that a business reason is probably outweighed by the liability risk.

  • Another thought regarding the use of off-line storage for archival purposes is that the technology, both physically and logically, has to be kept up to date along with current systems. In the old days, once you committed data to paper, it was no longer a worry for IT people. Keeping archives readable is going to affect system redesigns and upgrades, and it should never be postponed for a more convenient time.

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

  • It makes perfect sense to retain things like sales orders, financial transactions, court records, and health records in digital format indefinitely, because there may be an essential need for this historical information at a later point. But this type of normalized transactional data doesn't consume nearly as much storage as stuff like surveillance video, IoT telemetry, click stream, and binary objects.

    The fact that Susan deposited $100 to Tom's bank account back in 2004 may still be relevant today or even 10 years from now. The fact that Susan liked Tom's comment on Facebook yesterday has questionable value today and will be totally irrelevant 10 years from now.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • In general I don't believe in deleting relevant data, especially after working in places where even seemingly minor records remain relevant and end up being far more important than you'd think. In general I think that records involving monetary transactions and the exchange of goods and services never stop being relevant.

    In regard to email, I will say that where I work now we have a 3-month retention policy on normal email, with the option to archive things for far longer.

  • Steve Jones - SSC Editor - Tuesday, May 1, 2018 7:28 AM

    skeleton567 - Tuesday, May 1, 2018 7:02 AM

    Wow, Steve, you definitely pushed my buttons with this one. It is my humble opinion ( note I don't use the vernacular there ) that the evolution of mass storage has led to a drastic decline in the skillset to properly analyze, summarize, and archive data. It's easier to throw money at storage than to logically, properly and safely handle data volume.

    ...

    I do agree with some offline storage, but likely some of that still should go. The price paid or shipping location from a decade ago (or any PII data)? Not sure there's any legal reason to keep this, and I'd argue that a business reason is probably outweighed by the liability risk.

    OK, just for instance: a customer who bought $1000 worth of a product a decade ago probably bought a much larger volume than $1000 of the same product would buy now, and that gets into the whole area of cost of storage, sales, transportation, and profitability, all adjusted for inflation. You can tell I'm still a detail-oriented obsessive-compulsive, right?
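
    The inflation adjustment mentioned above can be sketched as a simple ratio of price indexes. This is only an illustration: the function name and the index values are hypothetical placeholders, not real statistics, and a real analysis would pull index values from an official source.

```python
def adjust_for_inflation(amount: float, index_then: float, index_now: float) -> float:
    """Express a historical amount in today's money using a price-index ratio.

    index_then and index_now are values of the same price index at the time
    of the sale and today (hypothetical inputs supplied by the caller).
    """
    if index_then <= 0:
        raise ValueError("index_then must be positive")
    return amount * index_now / index_then


# Hypothetical example: the index rose from 100 to 125 over the decade,
# so a $1000 sale back then corresponds to $1250 in today's money.
old_sale_today = adjust_for_inflation(1000.0, index_then=100.0, index_now=125.0)
```

    Without the original price and date, this comparison cannot be made, which is the poster's argument for keeping that detail.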

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

  • Good questions you've raised, Steve. Like you, I tend to hold onto emails for a long time. At my old job we had fewer resources, so we were involved in getting rid of a lot of the older data. We used to have a huge room full of paper documents dating back 12 years or more. We were involved in converting all of that data into electronic format. After that we never saved any paper documents. Everything that involved a client's signature was scanned in, then shredded. But we couldn't keep even the electronic format; we just didn't have the storage. I think we were required to keep adult clients' records for 12 years and adolescent clients' records for 18 years. After that, we had to archive and remove.

    At my current job they have deeper pockets and a mindset to hold onto everything. So, at this point everything is stored forever, or at least practically forever. I do wonder, though, how we would handle something like GDPR, with the "right to be forgotten". It would be a huge paradigm shift here.
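
    The 12-year and 18-year retention periods mentioned above, followed by archive and removal, could be checked with something like the sketch below. The function names, the `client_type` labels, and the choice to count from a record's last-activity date are assumptions made for illustration; the post does not say how the periods were measured.

```python
from datetime import date

# Retention periods in years, as recalled in the post above.
RETENTION_YEARS = {"adult": 12, "adolescent": 18}


def retention_expiry(last_activity: date, client_type: str) -> date:
    """Return the first date on which a record may be archived and removed.

    Assumption: the period is counted from the record's last-activity date.
    """
    target_year = last_activity.year + RETENTION_YEARS[client_type]
    try:
        return last_activity.replace(year=target_year)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Mar 1.
        return date(target_year, 3, 1)


def is_due_for_removal(last_activity: date, client_type: str, today: date) -> bool:
    """True once the retention period for this record has fully elapsed."""
    return today >= retention_expiry(last_activity, client_type)
```

    A scheduled job could run a check like this over record metadata and hand matching records to whatever archive-and-remove step the organization uses.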

    Kindest Regards, Rod

  • People store family photos on their cell phones - disposed of or lost.

    Companies store transactions on their storage arrays - expired and deleted.

    I keep emails of personal and familial significance in a well-organized set of folders - inaccessible and forgotten about when I pass.

    We are living in a time where we store information in unprecedented amounts, yet it will be a black hole to generations in the distant future.
    We have a serious misconception that data lives forever. It only lives as long as technologists maintain it.

  • Small text emails are not really a problem in terms of space. The issue with email is when large attachments get replicated many times within a company and it does actually become a space issue. My company has not instituted any retention policies, but I can certainly see problems in the future. Of course, nobody has the time to actually decide what is important and what is not, therefore, everything is saved in the event that it might be important.

    The other issue is with data associated with an application. When the application has been retired or replaced, are we required to keep the software around that can make sense of the data? As an example, we replaced our ERP system 10 years ago. A few years after the migration we told accounting that the old data would go away after a certain amount of time and that they should run reports and save them either on paper or as PDFs for audit purposes.

  • What is nice about virtualization is that it captures the entire system, so you can still report on data from old systems. Archiving won't capture everything, but it will make certain data points available. Right now the time series it can capture is decades long. It isn't forever.


  • Whenever I hear about "The Right to be Forgotten", I think about that 1995 movie, The Net, where Sandra Bullock has her digital identity erased by hackers. That movie introduced some themes that were ahead of its time.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • lburleso - Tuesday, May 1, 2018 9:17 AM

    People store family photos on their cell phones - disposed of or lost.

    Companies store transactions on their storage arrays - expired and deleted.

    I keep emails of personal and familial significance in a well-organized set of folders - inaccessible and forgotten about when I pass.

    We are living in a time where we store information in unprecedented amounts, yet it will be a black hole to generations in the distant future.
    We have a serious misconception that data lives forever. It only lives as long as technologists maintain it.

    lburleso, you provided an excellent example with the photos on cell phones. This is definitely 'data' that needs to be preserved. And you illustrated my earlier remarks about our responsibility to analyze needs and to provide and keep current the appropriate protections. Obviously, it's not all that desirable to try to 'summarize' personal photos, or to keep only a few. I'm currently in the middle of a project to preserve photos going back to the first days of photography and to update the technology for future generations. Problem number 1, obviously, is the media. The bulk of this 'data' is currently on paper prints in bound photo albums and storage boxes of 35mm slides. I just checked, and I alone have eight 10-ream computer paper boxes of old family photos to be updated. My sister also has a whole shelf unit of photo albums. Then there are in excess of 20k 35mm slides to be scanned, digitized, and identified. Problem number 2 is that there is only one copy, so only the family member who physically possesses them has access. Problem number 3 is the safety risk. Several years ago my brother's home burned to the ground, so everything he had along these lines is lost. Further, much of the old 8mm movie film that is left is warped and can no longer be run through a projector, which currently no one in the family even possesses.

    Many of the photos are unfortunately glued to  album pages, and some albums are hard-bound, making it difficult to scan pages.  Then when scanning is completed, full pages must be separated into individual pictures.  Further, lots of identifying info is either written on the album pages or even on the back of prints. 

    The solution is truly time-consuming and highly labor-intensive.  Full pages get scanned, then edited to crop out individual photos, which are then appropriately sized, manually edited to remove age blemishes such as surface cracks, and improve contrast and focus.  Finally the 'canvas' is enlarged and appropriate caption detail is typed in so it becomes an integral part of the digital file and visible while not covering the content of the photo. This process will work for both the old black-and-white paper photos and the 35mm color slides alike and combine them into the same media type.   This solution provides for:  a tiny fraction of the storage space and cost, the ability to have individual access for all family members using distributed copies, remote storage of multiple copies for safety and security, let alone the preservation of historical data in a current technology.  In this example you can see the need of actually preserving all available detail but in an efficient and useable format. 

    What it all boils down to is the benefit versus cost, a workable solution, and a good analysis of the situation.  As usual, the older technology is the largest part of the problem, and thus the most important to try to upgrade at any point before it gets out of the realm of possibility.

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

