James Lavery (8/5/2016)
Hi Steve, thanks - that's what I thought.
The problem is the customer wants representative data volumes on the new system, but anonymised. If it were just test/dev data (i.e. representative in structure and content but not in volume), then all your very good suggestions would apply.
As you say - there's no easy way. If the customer wants his very large database to be anonymised, he's going to have to wait for it to chug through!
No... You don't have to wait. Every system has a "tipping point": stay under it and things run very quickly, exceed it and they suddenly take forever. It's not easy to determine where the tipping point is, either.
An example of the tipping point is that it may take 2 seconds to update 1 million rows, a very predictable 4 seconds to update 2 million rows, 6 seconds to update 3 million rows, and 6 HOURS to update 4 million rows.
You would likely be able to do the updates on your tables 2 or 3 million rows at a time using a well-written loop that identifies and updates the rows in chunks.
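The chunked-loop idea can be sketched in T-SQL. This is just a minimal sketch, not a drop-in solution: the dbo.Customers table, the CustomerID key, the Anonymised flag column, and the batch size are all hypothetical and would need adjusting to the real schema.

```sql
-- Hypothetical example: anonymise surnames 2 million rows at a time.
-- Assumes dbo.Customers has a key column (CustomerID) and an
-- Anonymised bit column, defaulted to 0, to mark finished rows.
DECLARE @BatchSize INT = 2000000;  -- stay under the "tipping point"

WHILE 1 = 1
BEGIN
    UPDATE TOP (@BatchSize) c
       SET c.LastName   = 'Customer' + CAST(c.CustomerID AS VARCHAR(10)),
           c.Anonymised = 1
      FROM dbo.Customers c
     WHERE c.Anonymised = 0;

    IF @@ROWCOUNT = 0 BREAK;  -- nothing left to anonymise

    CHECKPOINT;  -- in SIMPLE recovery, lets log space be reused between batches
END;
```

Because each batch is its own implicit transaction, the log stays small and locks are released between batches. On a very large table, walking the clustered key in explicit ranges (WHERE CustomerID BETWEEN @From AND @To) instead of rescanning for the flag would likely perform better still.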
--Jeff Moden
Change is inevitable... Change for the better is not.