remove duplicate rows

  • I tried to remove some duplicate rows which have been at the root of several problems I have experienced today, but once I identified some dupes, I wasn't allowed to remove them.

    This table only has 4 fields, and has no PK or FK restraints defined. (It was put together by someone who was here before me and doesn't work here anymore)

    When I try to delete a row, just to see if I can, through the GUI (SSMS), it won't let me. It has a pop-up which says:

    The row value either updated or deleted either do not make the row unique or they alter multiple rows (2 rows).

    This occurs if I try to delete a row at a time or select several.

    I don't even understand that message.

  • I believe you are getting this error because there is no PK or Unique Index on the table so SSMS can't identify what to delete.

    Behind the scenes SSMS is running a statement like this:

    exec sp_executesql N'DELETE FROM test1 WHERE (Id = @Param1) AND (test_name = @Param2)',N'@Param1 int,@Param2 nvarchar(5)',@Param1=1,@Param2=N'Test1'

    So it first checks to make sure that it is only deleting 1 row, and if it isn't it throws the error.

    Here's an example:

    Run this and then try to delete one of the id 0 rows and you'll get the error:

    CREATE TABLE test1

    (

    id INT,

    test_name VARCHAR(10)

    )

    INSERT INTO test1 (

    id,

    test_name

    )

    SELECT

    0,

    'Test'

    UNION ALL

    SELECT

    0,

    'Test'

    UNION ALL

    SELECT

    1,

    'Test1'

    Then run this to actually delete the dupes:

    ;WITH cteDupes AS

    (

    SELECT

    ROW_NUMBER() OVER(PARTITION BY Id, test_Name ORDER BY id, test_name) AS row_id,

    id,

    test_name

    FROM

    dbo.test1

    )

    Delete FROM cteDupes WHERE row_id > 1

    And then run this to see that the row has been deleted:

    SELECT * FROM test1 AS T

  • Heh... and the read up on proper table design so you can tell the folks who designed that table how to fix it. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sadly, the people responsible for this are no longer here. I'm a web developer getting a crash course to become a DBA 😉

    thanks. I think I can work with this. More important, I understand the explanation.

  • OK, I just checked out the code, and it appears that while the snippet posted above will enable me to remove dupes from the table in that sample, I couldn't use it to remove dupes from my table. Should I just add an ID column? I think I can do that without affecting anything.

  • Are you running SQL Server 2005? That statement should work on 2005/2008 against any table. Can you post the actual table structure?

  • I didn't run it because by reading it, it appeared that it required an ID column or PK.

    Anyway it's on SQL 2005; here's the create statement:

    USE [RICCA]

    GO

    /****** Object: Table [dbo].[COFAHEAD] Script Date: 07/20/2009 09:52:34 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[COFAHEAD](

    [PartNo] [varchar](50) NOT NULL,

    [Header] [varchar](2000) NULL,

    [Footer] [varchar](2000) NULL,

    [version] [numeric](18, 0) NOT NULL,

    [storageCode] [char](2) NULL

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

  • OK, I tried running it. Here's the altered code:

    =============

    WITH cteDupes AS

    (

    SELECT

    ROW_NUMBER() OVER(PARTITION BY PartNo, version ORDER BY PartNo, version) AS row_id,

    PartNo,

    version

    FROM

    dbo.COFAHEAD

    )

    Delete FROM cteDupes WHERE row_id > 1

    ==============

    and here's the error message:

    Msg 156, Level 15, State 1, Line 1

    Incorrect syntax near the keyword 'WITH'.

    Msg 195, Level 15, State 10, Line 4

    'ROW_NUMBER' is not a recognized function name.

  • I chose those two fields (PartNo and Version) because they are, effectively, the PKs. That is, they aren't defined as such, but those are the two fields together I want to not be duplicated. I will be glad to actually mark them as PK when I am sure it won't hurt anything.

  • You have to put a ";" before the WITH. It's a little annoying, but it is required. You either have to end the prvious statement before the cte with a ";" or you need to precede the WITH with a ";".

    In my example you can see that I did use ;WITH.

  • I also ran it with the semicolon, got a similar error.

    To be clear, this is what I have in the query window in SSMS. Nothing else. Just the text between the lines:

    =================

    ; WITH cteDupes AS

    (

    SELECT

    ROW_NUMBER() OVER(PARTITION BY PartNo, version ORDER BY PartNo, version) AS row_id,

    PartNo,

    version

    FROM

    dbo.COFAHEAD

    )

    Delete FROM cteDupes WHERE row_id > 1

    =================

    and I get this:

    Msg 170, Level 15, State 1, Line 1

    Line 1: Incorrect syntax near ';'.

    Msg 195, Level 15, State 10, Line 4

    'ROW_NUMBER' is not a recognized function name.

  • Please run this against the same database and post on the results:

    SELECT @@VERSION AS version,

    SERVERPROPERTY('ProductVersion') AS version1,

    SERVERPROPERTY('ProductLevel') AS productlevel,

    [name],

    compatibility_level

    FROM

    sys.databases AS D

    WHERE

    D.database_id = DB_ID()

  • Oh geez. I am dealing with two database servers here. One is Win2000/SQL2000, the other being Win2003/SQL2005. I'm moving from one to the other. In fact, that's the point of this exercise.

    You have correctly pointed out that I was running the query against the 2000 DB. My apologies.

  • OK, I just ran it against the 2005 version of this thing. It removed the expected number of rows. Now I'll check it out in detail.

    thanks very much

  • Hey no problem. I'd probably do the same thing if I was running everything in SSMS 2005. This exercise is quite a bit more difficult in 2000 than 2005. AS a matter of fact in 2000 I'd probably load the data into a table with an id column, do the delete, truncate the original table, and then reload it from the other new table.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply