Keep It All

  • Under the new European GDPR, keeping data that you're not supposed to keep becomes significantly more expensive.

  • How appropriate given the current furore in the UK over the disposal of landing cards of the "Windrush Generation" and subsequent inability to prove right to remain.
    Love SQL Server Central!

  • I believe that Big Data is to a large extent over-hyped. Storage and analytics software providers are kind of like the guys who were selling shovels during the California gold rush. Not only do they profit from Big Data hype, but they also perpetuate the Big Data hype.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • There is a cost to cleaning up: determining what data you need to keep and what is safe to throw away. Unless cleanup is a habit ingrained into the work process, it's difficult to justify the time spent going back to review the data.

  • leddybill - Friday, April 20, 2018 7:57 AM

    There is a cost to cleaning up: determining what data you need to keep and what is safe to throw away. Unless cleanup is a habit ingrained into the work process, it's difficult to justify the time spent going back to review the data.

    This is very true, and whilst GDPR might feel overwhelming, I'd argue that it (and SOX, ISO, PCI, etc.) asks for things we ought to have been doing already. We should ingrain many of the policies of privacy, protection, and retention into our daily work habits and development practices.

  • Shifting gears a bit, we're setting up to delete 1TB from a 1.4TB database and 600GB from a 1TB database. The data we're dropping is simply data that no one thought of deleting. Most of it is imported data that has been successfully processed; people wanted to keep the original imports for troubleshooting, but no one set up eventual deletes. The original data is also available in compressed files elsewhere. No one ever even considered deleting the data because there was always "enough room" to not worry about it, and so they didn't. 😀

    Once the initial drops are done using the ol' do-si-do method (copying only what you need to a temporary filegroup, truncating the tables after a whole bunch of FK stuff to worry about, shrinking the database, rebuilding the indexes for what remains, and then moving the data back to the original filegroup and dropping the temporary one), we'll have nightly sliding-window deletes running so this doesn't happen again. I'm considering temporal partitioning to take advantage of SWITCH, but it'll be a real bugger with as many tables as are involved.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
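
The nightly sliding-window delete mentioned in the post above can be sketched roughly as follows. This is only an illustration, not the poster's actual job: it uses Python with SQLite as a stand-in for SQL Server, and the table `import_staging`, the column `loaded_at`, the 30-day window, and the function `purge_expired` are all hypothetical names chosen for the example. In T-SQL the same idea is usually written as a loop around `DELETE TOP (n)`; the point here is just the shape of the technique: compute a cutoff, then delete in small batches so each transaction stays short.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_expired(conn, table, date_column, keep_days, now, batch_size=1000):
    """Delete rows older than the retention window, one small batch at a time.

    `table` and `date_column` are trusted identifiers (hypothetical names),
    not user input. Returns the total number of rows removed.
    """
    cutoff = (now - timedelta(days=keep_days)).strftime("%Y-%m-%d %H:%M:%S")
    total = 0
    while True:
        # Remove at most batch_size expired rows, then commit, so that no
        # single transaction has to cover the whole backlog.
        cur = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN ("
            f"SELECT rowid FROM {table} WHERE {date_column} < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


def demo():
    """Build 100 days of fake import rows and keep only the last 30 days."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE import_staging (id INTEGER PRIMARY KEY, loaded_at TEXT)"
    )
    now = datetime(2018, 4, 20)
    rows = [
        ((now - timedelta(days=d)).strftime("%Y-%m-%d %H:%M:%S"),)
        for d in range(100)
    ]
    conn.executemany("INSERT INTO import_staging (loaded_at) VALUES (?)", rows)
    conn.commit()
    deleted = purge_expired(
        conn, "import_staging", "loaded_at", keep_days=30, now=now, batch_size=25
    )
    remaining = conn.execute("SELECT COUNT(*) FROM import_staging").fetchone()[0]
    return deleted, remaining
```

Run nightly, a job like this only ever has one day's worth of expired rows to remove, which is what keeps the backlog from building up again. Partition switching, as considered in the post, avoids the row-by-row delete entirely but needs the tables to be partitioned on the date column first.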

  • Steve Jones - SSC Editor - Friday, April 20, 2018 8:07 AM

    This is very true, and whilst GDPR might feel overwhelming, I'd argue that it (and SOX, ISO, PCI, etc.) asks for things we ought to have been doing already. We should ingrain many of the policies of privacy, protection, and retention into our daily work habits and development practices.

    There are costs on both sides. Continually backing up junk that doesn't provide any further value to your business is a cost; queries and servers having to wade through 30 years of data to pull back only the last three months is also a cost. Don't forget to factor in the potential liability if someone were to steal that old data or old lawsuits show up demanding old data you still have, and the breakeven analysis starts to make a lot of sense.

    It's truly amazing what your average company might still have lying around.  We went through an imaging initiative a decade ago and yet we still have warehouses' worth of paper records we just "forgot" to shred, even though the cost of renting the storage space alone is many times that of the disposal.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Unlike Google, Facebook, and Cambridge Analytica, most organizations are not in the business of stockpiling data. A restaurant chain or furniture manufacturer is not in the data business any more than it is in the accounting or real estate business. Yes, they need data the same way they need accountants and retail space, but it's just a means to an end, and more doesn't necessarily mean better.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Friday, April 20, 2018 11:02 AM

    Unlike Google, Facebook, and Cambridge Analytica, most organizations are not in the business of stockpiling data. A restaurant chain or furniture manufacturer is not in the data business any more than it is in the accounting or real estate business. Yes, they need data the same way they need accountants and retail space, but it's just a means to an end, and more doesn't necessarily mean better.

    Matt Miller (4) - Friday, April 20, 2018 10:00 AM

    ... Don't forget to factor in the potential liability if someone were to steal that old data or old lawsuits show up demanding old data you still have, and the breakeven analysis starts to make a lot of sense.

    Here's another cost: certain materials must be kept for legal reasons, but for other materials there is considerable latitude for deletion AS LONG AS the company has a consistent deletion policy. If there is ever a legal requirement for discovery, you can be required to search ALL the data you have preserved, and that's a costly bunch of billable hours. Better to search 5 years than 30 years.

    ...

    -- FORTRAN manual for Xerox Computers --

  • jay-h - Friday, April 20, 2018 11:30 AM

    Here's another cost: certain materials must be kept for legal reasons, but for other materials there is considerable latitude for deletion AS LONG AS the company has a consistent deletion policy. If there is ever a legal requirement for discovery, you can be required to search ALL the data you have preserved, and that's a costly bunch of billable hours. Better to search 5 years than 30 years.

    Given all the email leaks and legal subpoenas (DNC, Sony, Enron, etc.), it's understandable why many folks choose to discuss sensitive matters in person, ensuring there is no electronic record in the first place. Back in the day, business and political leaders would meet on the golf course because they didn't trust phone conversations, but with the advent of the internet folks let their guard down, perhaps thinking that emails are somehow more secure. Now we have always-connected phones that actively track our location and listen for commands. Perhaps it's time to leave the phone in the glove compartment and start rendezvousing in person at the golf club or park bench again.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Little Milton classic... a warning for our time

    https://www.youtube.com/watch?v=REoP_vUMnoc

    I started thinking about this after reading how JP Morgan used Palantir to spy on all aspects of employees' lives, including friends and contacts.

    ...

    -- FORTRAN manual for Xerox Computers --

  • Eric M Russell - Friday, April 20, 2018 12:15 PM

    Given all the email leaks and legal subpoenas (DNC, Sony, Enron, etc.), it's understandable why many folks choose to discuss sensitive matters in person, ensuring there is no electronic record in the first place. Back in the day, business and political leaders would meet on the golf course because they didn't trust phone conversations, but with the advent of the internet folks let their guard down, perhaps thinking that emails are somehow more secure. Now we have always-connected phones that actively track our location and listen for commands. Perhaps it's time to leave the phone in the glove compartment and start rendezvousing in person at the golf club or park bench again.

    Heh... and use cash to buy the drinks. 😀

    --Jeff Moden



  • Jeff Moden - Friday, April 20, 2018 1:03 PM

    Heh... and use cash to buy the drinks. 😀

    Doh!

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
