Just For Fun: An Impossible Delete

Question

Just For Fun: An Impossible Delete

Viewing 15 posts - 16 through 30 (of 156 total)

You must be logged in to reply to this topic. Login to reply

RBarryYoung SSC Guru Points: 143329 More actions · Answer 1

Jeff Moden (8/4/2008)
What an amazing article! Covers most everything from womb-to-tomb for such an "impossible" delete. I remember being amazed when Barry first posted his solution on the original thread. I'm even more amazed that he wove in some of the techniques used in the old days and took the time to explain how each byte of memory was cherished which made these types of methods absolutely necessary.
Well done, Barry!

Thanks, Jeff. High praise from the master himself. :blush:

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

RBarryYoung SSC Guru Points: 143329 More actions · Answer 2

Emre Erkan (8/5/2008)
Nice Article. I learned a lot. But for this problem there may be a shorter/ simpler solution...

In all fairness, I should point out that sometime after I posted my solution, Chris Morris posted what I believe to be the optimal solution. It is far shorter and simpler than mine, with none of my convoluted 30 year-old manipulations: http://www.sqlservercentral.com/Forums/FindPost526653.aspx. Of course, I couldn't very well write an articale about that!

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Phil Factor SSC-Insane Points: 20244 More actions · Answer 3

Yes. Shaun

The Sybase and SQL Server Update command allows the use of variables in the SET statement. In the case of SQL Server 2005, this is....

SET

{ column_name = { expression | DEFAULT | NULL }

| { udt_column_name.{ { property_name = expression

| field_name = expression }

| method_name ( argument [ ,...n ] )

}

| column_name { .WRITE ( expression , @Offset , @Length ) }

| @variable = expression

| @variable = column = expression [ ,...n ]

} [ ,...n ]

All I'm doing is talking advantage of the order of update in order to flag each row that is different from the one before (the table must be in the physical order of the columns that you are comparing). I'm just writing a 'd' (meaning delete) in every row that is different to the one before and then deleting each row that has that 'd'. I used the hash just to make the code more compact, but the logic should have been expanded to make it more readable. I should have saved each significant column to a separate variable, and compared them separately with an AND.

Note a couple of things.

Updates seem to be done in the order of the clustered index. In the example, it was done in the order we inserted them into the table variable. with each row, all updates that assign to a variable are done first, in order left to right, followed by the column assignments, left to right. You have to be careful, though as there is no definition of this behaviour in BOL as far as I can find, but it seems to work in all versions.

It is very fast. I am confident I can win the speed race against the other solutions as long as the table is in the correct physical order!

Best wishes,
Phil Factor

RBarryYoung SSC Guru Points: 143329 More actions · Answer 4

Chirag (8/5/2008)
great article!!

Thanks for the Feedback, Chriag!

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Phil Factor SSC-Insane Points: 20244 More actions · Answer 5

Oops

I've just seen Chris Morris solution in the original forum, which is pretty much the same as mine. He got there first. I'll shut up!

Best wishes,
Phil Factor

RBarryYoung SSC Guru Points: 143329 More actions · Answer 6

Atif Sheikh (8/5/2008)
GREAT ARTICLE...
Alot of learning points in it...

Thanks, Atif.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Shaun McGuile SSCarpal Tunnel Points: 4141 More actions · Answer 7

Thanks Phil

love the eBook and the SimpleTalk articles, always a good read.:)

The one about the ex-riot policemane turned DBA should have been quoted on the Stored Procs vs ORM thread. 😀

Hiding under a desk from SSIS Implemenation Work :crazy:

RBarryYoung SSC Guru Points: 143329 More actions · Answer 8

Tao Klerks (8/5/2008)
Hmm, my curiousity is piqued... The solution presented is quite interesting, but I believe the purported interest in bits and bytes is a little exaggerated, as the solution is still quite expensive (rewrites an entire column in a single update statement, and rewrites/deletes the bulk of the data if there are many duplicates - needing up to almost double the original storage space to complete).

No argument, Tao. Understand that up until that point, most of us, including myself at first, felt that it was not even possible given the restrictions. My solution is certainly not the most efficient, just the first one to "crack the nut" so to speak.

So I'm quite curious:
What would the most efficient way of addressing this problem be? (using least memory and / or using least I/O, especially on a very large data set; also consider whether the data can still be accessed during the update process)

Great question. My nomination would be for Chris Morris's entry, here: http://www.sqlservercentral.com/Forums/FindPost526653.aspx

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

RBarryYoung SSC Guru Points: 143329 More actions · Answer 9

Samuel Vella (8/5/2008)
I think this calls for Monty Pythons The Four Yorkshiremen skit:
http://www.phespirit.info/montypython/four_yorkshiremen.htm

Heh. Thanks, Samuel, that's one of my favorite. Some of the other fossils and I do that bit for the "kids" we work with every now and then:

"You had bytes? You were lucky! It was a good day for us if we even smelled a bit!" 😛

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Jan Van der Eecken SSCrazy Eights Points: 9029 More actions · Answer 10

Brilliant article! Reminds me somehow of one of my colleagues who wrote the following in C:

int x = 1;

int y = 2;

x ^= y ^= x ^= y;

just to swap two integers around without wasting another 16 bits (or was it even 8 back then) on the stack.

--------------------------------------------------------------------------
A little knowledge is a dangerous thing (Alexander Pope)
In order for us to help you as efficiently as possible, please read this before posting (courtesy of Jeff Moden)[/url]

RBarryYoung SSC Guru Points: 143329 More actions · Answer 11

Phil Factor (8/5/2008)
Surely this is the simplest solution? ...

Pretty good, Phil. I think that that is almost the same as Chris Morris's.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Tao Klerks SSCarpal Tunnel Points: 4297 More actions · Answer 12

OK, so now I'm in awe of Chris Morris too... 🙂

http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.

RBarryYoung SSC Guru Points: 143329 More actions · Answer 13

JohnG (8/5/2008)
This was a great article and I really enjoyed reading it. I'm particularly grateful for the excellent explanatory text which meant I could actually follow most of what was going on despite being fairly new to SQL.

At the risk of being overly picky there is one bit I have to query:
"Likewise, the F's are stored as "01000110" in upper-case and "01000110" in lower (if you look closely, you can tell which bit makes it upper or lower case)."
Am I simply not looking closely enough or are these values actually identical? Sneaky typo perhaps?

Ouch! Yeah, you got me. (Dang, how did that one slip past?) I beleive that the lower-case "f" should be binary "01100110".

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

RBarryYoung SSC Guru Points: 143329 More actions · Answer 14

Jan Van der Eecken (8/5/2008)
Brilliant article! Reminds me somehow of one of my colleagues who wrote the following in C:
int x = 1;
int y = 2;
x ^= y ^= x ^= y;
just to swap two integers around without wasting another 16 bits (or was it even 8 back then) on the stack.

Thanks, Jan. Yeah, I know that old chestnut. In fact I've been trying to figure out how to work it into a SQL QotD for the last 6 weeks. And now you've gone and given it away! 🙂

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

bob.holbrey Grasshopper Points: 11 More actions · Answer 15

A really simple option which is able to take advantage that "age" is a number...

Create a table "test" with the following data as shown:

select * from test

NameAgeSex

ABC 24M

DEF 24M

GHI 26F

PQRS 25F

insert test

select distinct name, age*1000, sex

from test

delete from test where age<1000

update test

set age=age/1000

select * from test

NameAgeSex

ABC 24M

DEF 24M

GHI 26F

PQRS 25F