Tao Klerks (8/5/2008)
Hmm, my curiosity is piqued... The solution presented is quite interesting, but I believe the purported interest in bits and bytes is a little exaggerated, as the solution is still quite expensive: it rewrites an entire column in a single update statement, and rewrites/deletes the bulk of the data if there are many duplicates, needing up to almost double the original storage space to complete. So I'm quite curious: what would the most efficient way of addressing this problem be (using the least memory and/or the least I/O, especially on a very large data set)? Also consider whether the data can still be accessed during the update process.
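For what it's worth, the usual way to avoid rewriting a whole column in one massive UPDATE (and the log/storage blow-up that comes with it) is to do the rewrite in small keyed batches. This is only a sketch, not the solution from the article; the table and column names (dbo.BigTable, Payload) and the REPLACE transformation are purely illustrative stand-ins:

```sql
-- Hedged sketch: batch the rewrite instead of one massive UPDATE.
-- dbo.BigTable, Payload, and the REPLACE() transform are illustrative only.
DECLARE @BatchSize int = 10000;
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    -- Touch only rows that still need converting, a batch at a time.
    UPDATE TOP (@BatchSize) t
    SET    t.Payload = REPLACE(t.Payload, 'old', 'new')
    FROM   dbo.BigTable AS t
    WHERE  t.Payload LIKE '%old%';

    SET @Rows = @@ROWCOUNT;  -- loop ends when no matching rows remain

    CHECKPOINT;  -- under SIMPLE recovery, lets log space be reused between batches
END
```

Each batch commits on its own, so locks stay short-lived and other sessions can read the table between batches, and the transaction log only ever needs to hold one batch's worth of changes rather than the entire column rewrite.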
This should be quite easy to test, given a few hours to create a large (GB-sized?) data set and try out a few different approaches...
This seems like a problem many more people are likely to face nowadays: powerful hardware, very robust systems that can stay online while you do things like add columns, but extremely large amounts of data (many gigabytes) and no maintenance window or downtime available!
--Jeff Moden
Change is inevitable... Change for the better is not.