Update Based on Select with Multiple Returns

  • Good Morning!

    I'm trying to update a column based on a WHERE statement that uses a SELECT statement as the qualifier.

    My goal is to find duplicate records in my table, then have a value in one of the columns change so that the users can search for and view these duplicate records and decide how they'd like to deal with them.

    Here's the SELECT statement that does a nice job of finding my duplicates:

    select IDX_Invoice_Number, IDX_Vendor_Number, status from _obj_3

    group by IDX_Invoice_Number, IDX_Vendor_number, status

    having COUNT (*)>1

    Here's what I'd use to accomplish my goal, if it were possible:

    update _obj_3

    set status='7'

    where idx_invoice_number in

    (

    select IDX_Invoice_Number, IDX_Vendor_Number, status from _obj_3

    group by IDX_Invoice_Number, IDX_Vendor_number, status

    having COUNT (*)>1

    )

    But, I get an error because the SELECT statement yeilds multiple returns:

    'Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.'

    Which it will, because they're duplicates that it's finding.

    I've tried using an INNER JOIN back to itself with no luck, and just created a VIEW using my SELECT statement, hoping to be able to update the VIEW.

    No luck there, as the VIEW won't allow me to update it.

    Well, thanks taking a look, and I can't wait to see what comes back.

    Thanks!

  • cmcfarland 51322 (10/8/2013)


    Good Morning!

    I'm trying to update a column based on a WHERE statement that uses a SELECT statement as the qualifier.

    My goal is to find duplicate records in my table, then have a value in one of the columns change so that the users can search for and view these duplicate records and decide how they'd like to deal with them.

    Here's the SELECT statement that does a nice job of finding my duplicates:

    select IDX_Invoice_Number, IDX_Vendor_Number, status from _obj_3

    group by IDX_Invoice_Number, IDX_Vendor_number, status

    having COUNT (*)>1

    Here's what I'd use to accomplish my goal, if it were possible:

    update _obj_3

    set status='7'

    where

    (

    select IDX_Invoice_Number, IDX_Vendor_Number, status from _obj_3

    group by IDX_Invoice_Number, IDX_Vendor_number, status

    having COUNT (*)>1

    )

    But, I get an error because the SELECT statement yeilds multiple returns.

    Which it will, because they're duplicates that it's finding.

    I've tried using an INNER JOIN back to itself with no luck, and just created a VIEW using my SELECT statement, hoping to be able to update the VIEW.

    No luck there, as the VIEW won't allow me to update it.

    Well, thanks taking a look, and I can't wait to see what comes back.

    Thanks!

    You were getting an error (you should always post the error message btw) because you have nothing on one side of your condition. You can't just say Where (SubQuery).

    Try this:

    update _obj_3

    set status='7'

    where IDX_Invoice_Number in

    (

    select IDX_Invoice_Number from _obj_3

    group by IDX_Invoice_Number, IDX_Vendor_number, status

    having COUNT (*)>1

    )

    You might also refer to today's article by Stefan. It deals with finding and then deleting duplicates.

    http://www.sqlservercentral.com/articles/duplicate+data/102383/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Hey, thanks for taking a look.

    I added the error, and amended my query in the post, as I'd actually written it with the 'in' clause to start with but messed it up when I brought it into my post.

    Thanks!

  • Since you are comparing 3 columns and it doesn't appear to have any different about them you could use a function like binary_checksum to create a single value for the duplicates. Like so:

    create table Obj3 (InvoiceNumber int, VendorNumber int, StatusID int,);

    go

    insert into Obj3

    values (1,1,1)

    , (1,1,1)

    , (2,2,2)

    , (2,2,2)

    , (3,3,3)

    select InvoiceNumber, VendorNumber, StatusID

    from Obj3

    group by InvoiceNumber, VendorNumber, StatusID

    having count(*) > 1

    update Obj3

    set StatusID = 7

    where BINARY_CHECKSUM(InvoiceNumber, VendorNumber, StatusID) in

    (select BINARY_CHECKSUM(InvoiceNumber, VendorNumber, StatusID)

    from Obj3

    group by InvoiceNumber, VendorNumber, StatusID

    having count(*) > 1)

    select InvoiceNumber, VendorNumber, StatusID

    from Obj3

    This should get you what you are looking for.



    Microsoft Certified Master - SQL Server 2008
    Follow me on twitter: @keith_tate

    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • OK, it seems to work great!

    I added some limiters to narrow it down to a single record that I knew should be modified by the query, and it did the trick.

    Seems like it might just take some time to go through all of the rows.

    I'll update the post once the run is complete.

    Thanks!

  • For this type of problem I like the Windows functions to give each row a number which makes the row unique.

    With CTE the 'double' rows can be removed.

    CREATE TABLE dbo.SB_Dupli

    (

    Tekst varchar(300) NULL,

    Number int NULL

    ) ON [PRIMARY]

    insert into SB_Dupli values ('aaaa', 1)

    insert into SB_Dupli values ('aaaa', 6)

    insert into SB_Dupli values ('aaaa', 1)

    insert into SB_Dupli values ('bbbb', 2)

    insert into SB_Dupli values ('bbbb', 2)

    insert into SB_Dupli values ('bbbb', 2)

    insert into SB_Dupli values ('cccc', 3)

    insert into SB_Dupli values ('dddd', 4)

    insert into SB_Dupli values ('dddd', 5)

    insert into SB_Dupli values ('dddd', 4)

    -- show all rows.

    select * from sb_dupli

    -- show all distinct rows.

    select distinct * from sb_dupli

    -- show all rows with duplicates

    select Tekst, number, COUNT(*) multiple_aantal from SB_Dupli

    group by Tekst, number having COUNT(*) > 1

    -- give the duplicates a 'number'

    -- remove the double rows.

    --------------------------------------------------------

    -- Actual code to delete

    --------------------------------------------------------

    ;with

    A as (

    select

    row_number() OVER(PARTITION BY tekst, number ORDER BY (SELECT NULL)) volgnr

    from sb_dupli

    )

    delete A where volgnr > 1

    --------------------------------------------------------

    -- End Actual code to delete

    --------------------------------------------------------

    -- show the table

    select * from sb_dupli

    -- drop the example table

    drop table SB_Dupli

    Variations. (Replace the row_number line with)

    -- Duplicate texts are removed.

    row_number() OVER(PARTITION BY tekst ORDER BY (SELECT NULL)) volgnr

    -- Duplicate texts are removed Highest number is left.

    row_number() OVER(PARTITION BY tekst ORDER BY number DESC) volgnr

    -- Duplicate texts are removed Lowest number is left.

    row_number() OVER(PARTITION BY tekst ORDER BY number) volgnr

    -- Duplicate number are removed.

    row_number() OVER(PARTITION BY number ORDER BY (SELECT NULL)) volgnr

    With this set of code you can remove duplicates where the complete row is duplicated or a number of fields is duplicated.

    The partition clause should contain all the fields to determine which are duplicates, first row in the order by is the row which is kept.

    Ben

  • --------------------------------------------------------

    -- Actual code to UPDATE

    --------------------------------------------------------

    ;with

    A as (

    select

    number,

    row_number() OVER(PARTITION BY tekst, number ORDER BY (SELECT NULL)) volgnr

    from sb_dupli

    )

    update A set number=number*100+volgnr where volgnr > 1

    --------------------------------------------------------

    -- End Actual code to UPDATE

    --------------------------------------------------------

    Rereading the question, you wanted an update, above an example of an update. Offcourse you have to come up with your own update strategy.

    If the range is not know you could go for negative numbers for example:

    set number = - (volgnr*100 +number)

    This assumes you do not have more than 99 doubles.

    Ben

  • Thanks!

    This is my first brush with CET functions, and I'm sure I'll be using this more as time goes on.

  • Thanks.

    That's closer to what I'm after this time around, but your previous post is also quite interesting to me.

    I think I may have a large number of duplicates, so I guess I could probably just increase the number until the query is acting according to my wishes.

    Does that sound right?

  • Use the following code to just see what is happening:

    --------------------------------------------------------

    -- Actual code to prepare for an UPDATE a Select

    --------------------------------------------------------

    ;with

    A as (

    select

    number,

    row_number() OVER(PARTITION BY tekst, number ORDER BY (SELECT NULL)) volgnr

    from sb_dupli

    )

    select Number*100+volgnr as New_Num, * where volgnr > 1

    --------------------------------------------------------

    -- End Actual code to prepare for an UPDATE a Select

    --------------------------------------------------------

    With the CTE functions it is always very easy to substitute in the last part with a SELECT, UPDATE or DELETE statement.

    The above shows alle the doubles and not the 'first' row for each combination.

    But you could also specify volgnr < 4, showing only for doubles for all situations or (volgnr > 1 and volgnr<4).

    Ben

  • Thanks! I'll give that a try shortly!

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply