Just For Fun: An Impossible Delete

  • Brings back many fond memories, or nightmares back then. Excellent article, which has the opportunity to not only show alternatives to how we do things now, but also provides opportunities to expand our knowledgebase with alternative examples. Never know when one may run into some of these opportunities.

    Steve Jimmo
    Sr DBA
    “If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan

  • Thanks for the thought provoking article R, and 5 stars for a good read!

    I've used the Temp In-Place method in the past. Sometimes to fix my own mistakes. I never knew it had a name.

    An aside: Assuming tinyint for the age would be incorrect, since the primary solution in the article drove the column negative.

    Tom Garth
    Vertical Solutions[/url]

    "There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves." -- Will Rogers
  • What a great thought experiment! :smooooth:

    It was fun to watch the problem solving unfold and the answer develop.

    These are the articles that keep me coming back.

    Thanks a ton for sharing;

    Greg

  • Actually, given that it's not a relational problem, a procedural solution is likely correct, unless there's another way that I'm mostly ignorant on (e.g. functional) to solve it .

    Now this was "just for fun"; which is fine. We all like a good "clever ugly" hack as brain candy!

    You didn't hold that (a) the data made sense or (b) that SQL was truly the best tool to solve it. Cool.

    Had it not been a puzzle whose constraints were simply to drive the thought process (rather than actually solve the stated problem quickly), I'd say:

    Given how often procedural programmers misuse SQL by using it like it was a set of commands for a COBOL tape merge, it pains me to see the case when a good "what not how" programmer goes along with forcing SQL into such unnatural contortions.

    SQL (even with its flaws) is for databases, not for reading sorted, line oriented files that some misguided soul who has mistaken an RDBMS for a file system. It's much rarer than the inverse, of course.

    If the data had not been stored in a RDBMS, we'd all have been thinking a command line pipe (Unix or PowerShell) or Perl, a quick C or VB hack to do what's actually a data cleansing job. So my solution is to ignore that it's been stored in SQL Server: bcp it out, pipe it through a 3 line Perl script, and bcp it in (or better, move the result to some appropriate appliance).

    It was an enjoyable read.

    Roger L Reid

  • I know this is an old article, and maybe someone already offered this solution, but to be honest the comment list is just too long for me to read it all...

    So I used your approach, but instead of the sex column, I used the age, which I assumed was an integer.

    The 'sequence' is multiplied by 1000 and added to the age, then we just reverse back at the end:

    insert into source(name, age, sex)

    select

    s.name,

    s.age + (t.number * 1000) as newage,

    s.sex

    from dbo.source s

    join tally t

    on t.number <= (

    select count(*)

    from source srccnt

    where srccnt.name = s.name

    and srccnt.name = s.name

    and srccnt.name = s.name

    )

    and t.number > 1

    group by

    s.name,

    s.age,

    s.sex

    ,t.number

    delete from source

    where age < 1000

    update source

    set age = age % 1000

  • Friday fun...:-)

    RBarryYoung (8/5/2008)


    Antares686 (8/5/2008)


    But my first impression was the first example for the contact numbers. I noticed you made 4 passes over the data instead of just the one. And even thou you option works the issue the customer faced was they had two seperate updates which caused the issue. If both had been handled in the same UPDATE they issue does not occurr and I would have done something like this

    BEGIN Transaction

    UPDATE CONTACT_NUMBERS

    SET CallOrder = (CASE WHEN CallOrder = 1 THEN 2 ELSE 1 END)

    COMMIT Transaction

    as this makes one pass across the data and adjust all records at the same time the conflict does not occurr.

    True, however, recall that I qualified this example in the article:

    And yes, I do know that there are other ways to do this correctly, especially with a CASE function. For reasons that I cannot get into, that was not an option here.

    I don' need no stinking CASE...:-P

    UPDATE CONTACT_NUMBERS

    SET CallOrder = CallOrder | 2 - 1

    Then again this might not have been right for that customer either...;-)

    Tom Garth
    Vertical Solutions[/url]

    "There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves." -- Will Rogers
  • arms.dan (10/30/2009)


    Your off-hand comment about naming the field "Gender" rather than "Sex" piqued my curiosity. Why? As I understand the words, sex means biological differences: chromosomes and sexual organs. Gender refers to the characteristics a society or culture delineates as masculine or feminine. There's some blurring of the definitions, but I think they generally hold. So why do you prefer Gender for the column label?

    It is in the interest of calling things what they are. In a medical file, it's probably more appropriate to use 'sex' but in a demographics file, it's probably more informative to use 'gender' - if both of these items were in my medical files, they would have different answers. Since I purchase a lot of cosmetics and other products, someone looking at sales information would want to consider me female, which is a gender.

  • Jasmine D. Adamson (10/30/2009)


    arms.dan (10/30/2009)


    Your off-hand comment about naming the field "Gender" rather than "Sex" piqued my curiosity. Why? As I understand the words, sex means biological differences: chromosomes and sexual organs. Gender refers to the characteristics a society or culture delineates as masculine or feminine. There's some blurring of the definitions, but I think they generally hold. So why do you prefer Gender for the column label?

    It is in the interest of calling things what they are. In a medical file, it's probably more appropriate to use 'sex' but in a demographics file, it's probably more informative to use 'gender' - if both of these items were in my medical files, they would have different answers. Since I purchase a lot of cosmetics and other products, someone looking at sales information would want to consider me female, which is a gender.

    Jasmine,

    This may be a generational issue, but last time I checked, most sex-differentiation categories (i.e., Y vs X chromosome) give you the options of male or female. So I beg to differ whether female characterizes only a gender. (And I hope this does not engender much more debate - we are moving far afield from the original brain-teaser).

  • Interesting Read! Good article! and a clever use of an "old" method.

    Comment on the "The Temp In-Place Method". In those days we also worried about the machine cycles used, so for the example described in this part of the article, we would have limited the number of updates to 3: first, update '1' to '-2'; second, update '2' to '1'; and final pass to update '-2' to '1'.

  • (Oops!! correction at the end of the post in bold.)

    Interesting Read! Good article! and a clever use of an "old" method.

    Comment on the "The Temp In-Place Method". In those days we also worried about the machine cycles used, so for the example described in this part of the article, we would have limited the number of updates to 3: first, update '1' to '-2'; second, update '2' to '1'; and final pass to update '-2' to '2'.

  • RBarryYoung (8/5/2008)


    Antares686 (8/5/2008)


    But my first impression was the first example for the contact numbers. I noticed you made 4 passes over the data instead of just the one. And even thou you option works the issue the customer faced was they had two seperate updates which caused the issue. If both had been handled in the same UPDATE they issue does not occurr and I would have done something like this

    BEGIN Transaction

    UPDATE CONTACT_NUMBERS

    SET CallOrder = (CASE WHEN CallOrder = 1 THEN 2 ELSE 1 END)

    COMMIT Transaction

    as this makes one pass across the data and adjust all records at the same time the conflict does not occurr.

    True, however, recall that I qualified this example in the article:

    And yes, I do know that there are other ways to do this correctly, especially with a CASE function. For reasons that I cannot get into, that was not an option here.

    See, my old time itch is boolean algebra. This solution doesn't really use it, but the concept is the same.

    UPDATE CONTACT_NUMBERS

    SET CallOrder = -1*CallOrder + 3

    1 becomes -1 + 3 = 2

    2 becomes -2 + 3 = 1

    --

    JimFive

  • Excellent link - that will come in very handy.

    And makes my solution to the article much simpler, no need to add a fake ID or any preparation just one simple delete of the first or only record:

    DELETE a

    FROM Play a

    WHERE replace(%%LockRes%%,':','') =

    (SELECT min(replace(%%LockRes%%,':',''))

    FROM Play b

    WHERE a.TheName = b.TheName

    AND a.Identifier = b.Identifier

    AND a.sex = b.sex

    jghali (10/30/2009)


    Funny how this article just came out today... I was actually on a similar thread last week and someone came up with a great solution...

    My personal opinion the article is interesting but ... hmmm...

    How about these solutions from this thread...

    http://www.sqlservercentral.com/Forums/Topic793765-145-1.aspx

    There's an undocumented identity key for every row of any table that can be used... in one simple delete statement you can delete the duplicate rows...

    I was amazed...

    Check it out.

    Thanks

  • The original article states that you cannot use an identity column but says nothing about adding columns in general. Doesn't the entire issue come down to uniquely identifying each row? If so, why not simply add a guid column and fill it with newId()? The other solutions effectively do the same except into a table variable or by munging the data.

  • It says no additional identity column so that doesn't outrule the hidden one that already exists... As for adding a guid it depends on how strictly you say what is an identity column - is that simply a column with the identity property, therefor any other column like uniqueidentifier can be added?...!

    I love the way, once most or all possibilities are exhausted, we try and re-interpret the question, bend

    the rules...

    Thomas-282729 (10/30/2009)


    The original article states that you cannot use an identity column but says nothing about adding columns in general. Doesn't the entire issue come down to uniquely identifying each row? If so, why not simply add a guid column and fill it with newId()? The other solutions effectively do the same except into a table variable or by munging the data.

  • I wanted to come up with a solution of my own with reading just the statement of the challenge. So, here it is. I took me about 10-15 minutes to have the flash of inspiration. Now I will go back and read the article past the "Cursors? Foiled Again" section and the discussion thread.

    [font="Courier New"]

    create table stat (

    Name varchar(50), Age int, Sex char(1)

    );

    go

    set nocount on;

    insert stat values ('ABC', 24, 'M');

    insert stat values ('ABC', 24, 'M');

    insert stat values ('LMN', 27, 'M');

    insert stat values ('LMN', 27, 'M');

    insert stat values ('LMN', 27, 'M');

    insert stat values ('PQRS', 25, 'F');

    insert stat values ('XYZ', 24, 'M'); -- These are not dupli-

    insert stat values ('XYZ', 24, 'M'); -- cates in the article

    set nocount off;

    select * from stat order by Name;

    go

    set nocount on;

    declare @mo int, @i int;

    select @mo = MAX(s.Occurances)

    from (

    select Name, Age, Sex, COUNT(*) Occurances

    from stat

    where Age < 1000

    group by Name, Age, Sex

    ) s;

    set @i = 2;

    while (@i <= @mo) begin

    insert stat

    select Name, Age + 1000, Sex

    from stat

    where Age < 1000

    group by Name, Age, Sex

    having COUNT(*) >= @i;

    set @i = @i + 1;

    end

    delete stat where Age < 1000;

    update stat set Age = Age - 1000;

    set nocount off;

    go

    select * from stat order by Name;

    go

    drop table stat;

    go

    [/font]

    Sincerely,
    Daniel

Viewing 15 posts - 121 through 135 (of 156 total)

You must be logged in to reply to this topic. Login to reply