October 29, 2019 at 11:17 am
Minor tweak, but if it's SQL Server you could possibly replace
UPDATE account SET active_yn =0 WHERE account_code='4c'
SELECT 'AFTER' 'AFTER', * FROM dbo.account WHERE account_code='4c'
with
UPDATE account SET active_yn =0
OUTPUT 'AFTER' AS [AFTER], INSERTED.*
WHERE account_code='4c'
October 29, 2019 at 11:21 am
In my current role, scripts are executed by the Release Team, so it's difficult to hand them something that requires input, e.g. changing the @doCommit value to commit the update. However, getting an execution report back from them is easy. For this reason, I tend to use something similar to: .....
I like this - I might just add an "OUTPUT deleted.*" into the update statement
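For illustration, a minimal sketch of what that @doCommit pattern could look like with OUTPUT deleted.* added; the variable name is the one mentioned above, and the table and column names just reuse the article's dbo.account example:
DECLARE @doCommit bit = 0;  -- the release team flips this to 1 only when the output looks right

BEGIN TRAN;

UPDATE dbo.account
SET    active_yn = 0
OUTPUT deleted.*            -- execution report: the rows as they looked before the change
WHERE  account_code = '4c';

SELECT @@ROWCOUNT AS rows_updated;

IF @doCommit = 1
    COMMIT TRAN;
ELSE
    ROLLBACK TRAN;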
MVDBA
October 29, 2019 at 11:21 am
Like Freddie, I have developers save the data to be modified to a utility database on the same server. This has a few benefits. The SELECT INTO query can be used to confirm the modification logic, rollbacks are relatively easy and there's a saved history of data modifications.
Also, you can create a naming convention for the new tables in the utility database, like TableName_TicketNumber, for high traffic tables that get more frequent modifications.
I don't have them save into the same database because then you end up with a database with all sorts of cruft.
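A rough sketch of that pattern, with an invented utility database name and ticket number:
-- the SELECT INTO doubles as a check of the modification logic
SELECT *
INTO   UtilityDB.dbo.Account_TICKET1234
FROM   dbo.Account
WHERE  account_code = '4c';

-- if a rollback is ever needed, the saved copy can be joined back to the live table, e.g.
-- (assuming account_code identifies the rows):
-- UPDATE a
-- SET    a.active_yn = b.active_yn
-- FROM   dbo.Account a
-- JOIN   UtilityDB.dbo.Account_TICKET1234 b ON b.account_code = a.account_code;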
October 29, 2019 at 11:38 am
I've used a similar method many times. All it takes is trashing a database once to learn the value of writing the WHERE clause in a SELECT statement before updating or deleting. If it is a big change, you can also back up the table with an INSERT INTO <tblname_backup>. Just don't forget to drop the table once the data is verified.
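Something along these lines, where the backup table name is just a placeholder (SELECT INTO creates the copy in one step):
SELECT * INTO dbo.account_backup_20191029 FROM dbo.account WHERE account_code = '4c';

-- ...run the update, verify the data...

DROP TABLE dbo.account_backup_20191029;  -- only once the change has been verified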
October 29, 2019 at 12:04 pm
Depending on the table size and the amount of data changed/deleted/updated, I use OUTPUT inserted / OUTPUT deleted into a table in another database. Once I have reviewed the changes or deletions, I remove the table I created with the backup data.
Most of our changes go to the test system first, then BA, and last to the production server, but I am aware that this is not always possible. Changes also get at least another pair of eyes while they are being made.
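For example, a sketch of the OUTPUT ... INTO variant; the backup database and table names are invented, and the target table must have no triggers, foreign keys, or check constraints:
-- create an empty, identically-structured table in the other database first
SELECT * INTO BackupDB.dbo.account_20191029 FROM dbo.account WHERE 1 = 0;

UPDATE dbo.account
SET    active_yn = 0
OUTPUT deleted.* INTO BackupDB.dbo.account_20191029   -- before-image of every changed row
WHERE  account_code = '4c';

-- once the change has been reviewed and signed off:
-- DROP TABLE BackupDB.dbo.account_20191029;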
October 29, 2019 at 12:10 pm
Just my 2 cents...
I've always done updates the way you've posted. As some have said, it's saved my keester more than once.
I also hammer (in a mentor/nice way) the method onto the Devs and they get it. For those that worry about having both Rollback and Commit being commented out, I'll take the chance there for several reasons...
The error was discovered the next morning... after the reporting server (which has copies of all the production tables) was updated. They went to do a restore because the damage was propagated by the nightly runs... yeah... that's when they also found out that backups had been silently failing for about 3 weeks.
It took 40 people more than 10 days to get us back into business by reloading 3 weeks of files in the correct order, doing the processing on them, making a shedload of manual entries from source documents, etc, etc, and it was never 100% correct for about 6 months.
A lot of things changed at that company because of that episode, including them suddenly following the full development and deployment life cycle that I'd been trying to pound into their heads for a couple of years. Most importantly, hard-coded rowcounts for updates, along with manual Rollback/Commits, were always required, because the failure we discovered (and which was documented in a CONNECT item) clearly demonstrated that we couldn't rely on SQL Server for such things.
And, I'll gladly "take the hit" when a Developer sometimes forgets to do a Rollback/Commit on the Dev box. That causes a whole lot less damage (none) than an UPDATE going awry, even in Dev. I'll have some fun with a Developer that does sometimes forget, but it's always done in a friendly manner and I NEVER "come down hard on" a Developer that has done such a thing.
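A rough sketch of that hard-coded rowcount idea, reusing the article's example; the expected number is obviously made up:
DECLARE @ExpectedRows int = 3;   -- hard-coded up front, agreed as part of the change

BEGIN TRAN;

UPDATE dbo.account
SET    active_yn = 0
WHERE  account_code = '4c';

IF @@ROWCOUNT <> @ExpectedRows
BEGIN
    PRINT 'Row count did not match the expected value - rolling back.';
    ROLLBACK TRAN;
END;

-- if the transaction is still open, the Commit (or Rollback) is issued manually once everything checks out
--rollback tran
--commit tran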
Shifting gears a bit, we're starting to join the "Big boy's club". We have several databases teetering on the 1 TB mark and two teetering on the 2 TB mark and several tables in the 100 to 300 GB range. I've made it so that some of these larger tables live in their own databases and are referenced from the main database by synonyms. Some of those tables have actually been split to multiple "sub-tables" using one form of partitioning or another and some of the tables have actually been split to separate databases using partitioned views. Of course, the whole reason for that is to be able to very quickly do "get back in business" partial restores of one form or another. Of course, I've also been able to make some of the legacy partitions (coming up to 120 monthly partitions on one table that we have to "keep the data forever" on) READ ONLY, which also has the benefit of greatly decreasing the time and space it takes to do nightly full backups.
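For anyone who hasn't used them, the synonym part looks roughly like this (the names here are invented):
-- the big table physically lives in its own database...
CREATE SYNONYM dbo.TransactionDetail FOR BigTablesDB.dbo.TransactionDetail;

-- ...but existing code in the main database keeps referencing dbo.TransactionDetail unchanged
SELECT TOP (10) * FROM dbo.TransactionDetail;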
--Jeff Moden
Change is inevitable... Change for the better is not.
October 29, 2019 at 12:12 pm
We back up the data we are going to change to a DB used for that purpose only. We back up the key and any field we are going to modify, and we name the backup table for the ticket we are working from. We can use that for rolling back the data. We also have a SQL Agent job that runs weekly and drops tables in that database that were created more than 6 months ago.
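The weekly cleanup step could be as simple as something like this; the backup database name is invented, and the 6-month cutoff is the one described above:
DECLARE @sql nvarchar(max) = N'';

SELECT @sql += N'DROP TABLE ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) + N';' + NCHAR(13)
FROM   BackupDB.sys.tables  t
JOIN   BackupDB.sys.schemas s ON s.schema_id = t.schema_id
WHERE  t.create_date < DATEADD(MONTH, -6, GETDATE());

EXEC BackupDB.sys.sp_executesql @sql;   -- runs in the context of the backup database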
October 29, 2019 at 12:42 pm
Hello,
It is a good idea to write out the updated/deleted rows in XML format (or another common format) into an audit table so the changes can be tracked at a later date. This works fine for smaller changes.
Typically, when I run a script, I 'SET IMPLICIT_TRANSACTIONS ON' as the first thing in the script. After the DML, the changes can be committed or rolled back depending on the number of records updated. This works well where there is separation of duties and a developer cannot run updates in production.
For a major change, I try to take a backup of the table, or back up a subset based on the predicate used for the update.
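A minimal sketch of that implicit-transaction pattern, again borrowing the article's dbo.account example:
SET IMPLICIT_TRANSACTIONS ON;   -- the UPDATE below opens a transaction automatically

UPDATE dbo.account
SET    active_yn = 0
WHERE  account_code = '4c';

SELECT @@ROWCOUNT AS rows_updated, @@TRANCOUNT AS open_transactions;

-- if the number of records updated looks right:
--COMMIT TRAN;
-- otherwise:
--ROLLBACK TRAN;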
October 29, 2019 at 12:46 pm
Before making any data change, I back up the table. Always - no matter how "trivial" the change. I keep these table backups for a period of time before dropping them.
I also like Temporal Tables, but I'm not sure how well that solution would scale when needing to handle thousands of tables.
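For reference, turning an existing table into a system-versioned (temporal) table looks roughly like this; the column, constraint and history-table names are placeholders, and the table is assumed to already have a primary key:
ALTER TABLE dbo.account ADD
    ValidFrom datetime2 GENERATED ALWAYS AS ROW START HIDDEN
        CONSTRAINT DF_account_ValidFrom DEFAULT SYSUTCDATETIME(),
    ValidTo   datetime2 GENERATED ALWAYS AS ROW END HIDDEN
        CONSTRAINT DF_account_ValidTo DEFAULT CONVERT(datetime2, '9999-12-31 23:59:59.9999999'),
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo);

ALTER TABLE dbo.account
    SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.account_History));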
October 29, 2019 at 12:48 pm
Where I work, I encourage people never to write a block of code that opens a transaction without an uncommented ROLLBACK or COMMIT. So instead of this:
SELECT * FROM dbo.account WHERE account_code='4c'
SELECT @@ROWCOUNT
BEGIN TRAN
UPDATE account SET active_yn =0 WHERE account_code='4c'
SELECT @@ROWCOUNT
--rollback tran
--commit tran
I would have this:
SELECT * FROM dbo.account WHERE account_code='4c'
SELECT @@ROWCOUNT
BEGIN TRAN
UPDATE account SET active_yn =0 WHERE account_code='4c'
SELECT @@ROWCOUNT
rollback tran --change to commit if you really mean it
--commit tran
The reason is that on more than one occasion script executors have mistakenly forgotten to uncomment either option (the rollback or the commit) and left a transaction hanging out there, chewing up the version store and the t-log, and in the worst cases causing blocking. Better to just roll back by default.
October 29, 2019 at 1:19 pm
Something I like to teach our younger DBAs and Developers is that UPDATE and DELETE statements are simply SELECT statements with an added feature. My favorite use of this idea is to change my SELECT statement into the UPDATE/DELETE instead of writing a whole new copy of that query below. For example:
--Start with this
SELECT *
FROM dbo.Account a WHERE account_code='4c';
--***********************************************
--Run the above statement and if it returns only the rows
--you want to change then add the UPDATE/DELETE statement
--along with the TRAN pieces for added security
--Edit above statement to become this
BEGIN TRAN
UPDATE a
SET a.active_yn = 0
--SELECT *
FROM dbo.Account a WHERE account_code='4c';
--commit tran
--rollback tran
October 29, 2019 at 2:02 pm
I like @"Jeff Moden"'s Rule 2:
another auditable process we rigidly follow is that NO ONE IS ALLOWED TO DEPLOY THEIR OWN CODE TO PRODUCTION
I used to hate having DBAs deploy reports I designed to the report server, as they had all sorts of settings and default parameter values that needed entering. Entering them by hand was the usual process back then. I would always send the reports for deployment along with a long list of instructions. The DBAs hated going through this (understandably) and there were always a couple of errors made. When a more senior DBA came along, he said to me that he never had to carry out deployments by hand: any deployment passed on to him was always performed via the accompanying script.
Not deploying your own code forces you to think about all the things that could go wrong when someone else must deploy it. That person may not have any idea of what your code does or how it should be deployed, and does not need to. Not only does it force you to think, it makes you take steps to ensure bad things will not happen.
October 29, 2019 at 2:11 pm
NO ONE IS ALLOWED TO DEPLOY THEIR OWN CODE TO PRODUCTION, PERIOD! Except for the addition of indexes, that includes me... the DBA.
When I worked as a test consultant for database performance, we used to call this "marking your own homework": never ever deploy unless somebody has sanity checked it, and even then, let the other guy press F5.
MVDBA
October 29, 2019 at 3:26 pm
If your 500 GB database used multiple filegroups (and the bigger tables were ideally partitioned into different FGs too), then you would be able to do a partial restore of only the single filegroup where the faulty table sits.
You would not be able to read from tables in the other filegroups, and the [PRIMARY] filegroup will always be restored too (for this reason NO user table / index should reside there), but instead of waiting 4 hours you would be done in a few minutes (as long as the table is not in the biggest filegroup). Another benefit is that you could place the database files on different disks (slow HDDs for the cold data, fast SSDs for the hot data).
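A rough illustration of such a piecemeal restore; the full recovery model is assumed and the database, filegroup and backup file names are invented:
RESTORE DATABASE Sales
    FILEGROUP = 'PRIMARY', FILEGROUP = 'FG_Current'
    FROM DISK = N'D:\Backup\Sales_full.bak'
    WITH PARTIAL, NORECOVERY;

RESTORE LOG Sales FROM DISK = N'D:\Backup\Sales_log.trn' WITH RECOVERY;

-- PRIMARY and FG_Current are now online; the remaining (e.g. read-only archive) filegroups
-- can be restored later, or not at all if they are not needed to get back in business.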
God is real, unless declared integer.
October 29, 2019 at 3:44 pm
Well, that mostly happens when developers have direct write access to production databases, especially new joiners.
Any sort of script execution should happen through the DBAs. Even then, the DBAs are not really sure about the effects of the script. With one of my clients, we have suggested that developers use the OUTPUT clause on INSERT, UPDATE and DELETE statements. The output rows get logged into a table, named for the ticket number, in a DB which is cleaned periodically. There is a little additional storage usage, but considering the business impact of "Oops, I deleted that data", I would take it.
With this process, I always make sure there is an OUTPUT clause on any data change statement.