Update duplicate values to 0

  • Hi,

    I need some help in my query. Want to change numbers to 0 if the row is a duplicate records based on uniqueness of 2 columns

    Condition 1 ) ID and Market

    Condition 2 ) ID,PL

    For the Same ID and Market , I want to retain one row of original values in MarketComplete and MarketAssigned , rest should be 0

    For the Same ID and PL , I want to retain one row of original values in PLCompleted and PLAssigned , rest should be 0

    CREATE TABLE #Duplicate

    ( ID INT,

    Market VARCHAR(5),

    PL VARCHAR (5),

    MarketCompleted INT,

    MarketAssigned INT,

    PLCompleted INT,

    PLAssigned INT

    )

    INSERT INTO #Duplicate ( ID,Market,PL,MarketCompleted,MarketAssigned,PLCompleted,PLAssigned )

    SELECT 2991008, 'AM', 'RSS', 0, 13, 13, 0 UNION ALL

    SELECT 2991008, 'AM', 'CSS', 0, 13, 13, 0 UNION ALL

    SELECT 2991008, 'NAO', 'CSS', 0, 13, 13, 0 UNION ALL

    SELECT 2991008, 'SCA', 'CSS', 0, 13, 13, 0 UNION ALL

    SELECT 2892552, 'AM', 'RSS', 13, 13, 13, 13 UNION ALL

    SELECT 2892552, 'AM', 'CSS', 13, 13, 13, 13 UNION ALL

    SELECT 2113785, 'AM', 'ALS', 13, 13, 13, 13

    SELECT * FROm #Duplicate

     

    --Desired table :

    SELECT 2991008 AS ID, 'AM' AS Market, 'RSS' AS PL, 0 AS MarketCompleted, 13 AS MarketAssigned, 13 AS PLCompleted, 0 AS PLAssigned UNION ALL

    SELECT 2991008 AS ID, 'AM' AS Market, 'CSS' AS PL, 0 AS MarketCompleted, 0 AS MarketAssigned, 13 AS PLCompleted, 0 AS PLAssigned UNION ALL

    SELECT 2991008 AS ID, 'NAO'AS Market, 'CSS' AS PL, 0 AS MarketCompleted, 13 AS MarketAssigned, 0 AS PLCompleted, 0 AS PLAssigned UNION ALL

    SELECT 2991008 AS ID, 'SCA' AS Market, 'CSS' AS PL, 0 AS MarketCompleted, 13 AS MarketAssigned, 0 AS PLCompleted, 0 AS PLAssigned UNION ALL

    SELECT 2892552 AS ID, 'AM' AS Market, 'RSS' AS PL, 13 AS MarketCompleted, 13 AS MarketAssigned, 13 AS PLCompleted, 13 AS PLAssigned UNION ALL

    SELECT 2892552 AS ID, 'AM' AS Market, 'CSS' AS PL, 0 AS MarketCompleted, 0 AS MarketAssigned, 13 AS PLCompleted, 13 AS PLAssigned UNION ALL

    SELECT 2113785 AS ID, 'AM' AS Market, 'ALS' AS PL, 13 AS MarketCompleted, 13 AS MarketAssigned, 13 AS PLCompleted, 13 AS PLAssigned

    DROP TABLE #Duplicate

     

    Thanks,

    PSB

  • Jeez, 13150 points and you still haven't learned how to present your code in a code block.

    All you need to do is click on the Insert/edit code sample button and paste your code in there.

    CREATE TABLE #Duplicate
    (
    ID INT
    ,Market VARCHAR(5)
    ,PL VARCHAR(5)
    ,MarketCompleted INT
    ,MarketAssigned INT
    ,PLCompleted INT
    ,PLAssigned INT
    );

    INSERT INTO #Duplicate
    (
    ID
    ,Market
    ,PL
    ,MarketCompleted
    ,MarketAssigned
    ,PLCompleted
    ,PLAssigned
    )
    VALUES
    (2991008, 'AM', 'RSS', 0, 13, 13, 0)
    ,(2991008, 'AM', 'CSS', 0, 13, 13, 0)
    ,(2991008, 'NAO', 'CSS', 0, 13, 13, 0)
    ,(2991008, 'SCA', 'CSS', 0, 13, 13, 0)
    ,(2892552, 'AM', 'RSS', 13, 13, 13, 13)
    ,(2892552, 'AM', 'CSS', 13, 13, 13, 13)
    ,(2113785, 'AM', 'ALS', 13, 13, 13, 13);

    SELECT *
    FROM #Duplicate;

    --Desired table :
    SELECT *
    FROM
    (
    VALUES
    (2991008, 'AM', 'RSS', 0, 13, 13, 0)
    ,(2991008, 'AM', 'CSS', 0, 0, 13, 0)
    ,(2991008, 'NAO', 'CSS', 0, 13, 0, 0)
    ,(2991008, 'SCA', 'CSS', 0, 13, 0, 0)
    ,(2892552, 'AM', 'RSS', 13, 13, 13, 13)
    ,(2892552, 'AM', 'CSS', 0, 0, 13, 13)
    ,(2113785, 'AM', 'ALS', 13, 13, 13, 13)
    ) x (ID, Market, PL, MarketCompleted, MarketAssigned, PLCompleted, PLAssigned);

    DROP TABLE #Duplicate;

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • use the standard ROW_NUMBER() with a windowing function, and then remove any where the ROW_NUMBER() value is not 1?

  • Edit: Corrected typo, as pointed out by Mr. Brian Gale.

    UPDATE D
    SET MarketCompleted = 0, MarketAssigned = 0
    FROM (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY ID, Market ORDER BY PL DESC) AS row_num
    FROM #Duplicate
    WHERE MarketCompleted <> 0 OR MarketAssigned <> 0
    ) AS D_first
    INNER JOIN #Duplicate D ON D.ID = D_First.ID AND D.Market = D_First.Market AND
    D.PL < D_First.PL
    WHERE D_first.row_num = 1

    UPDATE D
    SET PLCompleted = 0, PLAssigned = 0
    FROM (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY ID, PL ORDER BY Market DESC) AS row_num
    FROM #Duplicate
    WHERE PLCompleted <> 0 OR PLAssigned <> 0
    ) AS D_first
    INNER JOIN #Duplicate D ON D.ID = D_First.ID AND D.PL = D_First.PL AND --<<--Corrected this line
    D.Market < D_First.Market
    WHERE D_first.row_num = 1

    SELECT 'Result', * FROM #Duplicate ORDER BY ID, Market, PL;

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Thanks for the queries above. For the PL update, it's not working as expected .

     

     

    Attachments:
    You must be logged in to view attached files.
  • PSB wrote:

    Thanks for the queries above. For the PL update, it's not working as expected .

    OK, adjust it as you need to.  You didn't provide enough details to know which of the duplicates you wanted to be considered the "master" (controlling record), the one that doesn't get zeroed out.

     

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Master controlling record is the ID . If there are duplicates for MARKET for same ID then keep one value for the Market and update the rest to 0.

    If there are duplicates for PL for same ID then keep one value for the PL and update the rest to 0.

     

  • I think the issue is Scott's query has a typo:

    INNER JOIN #Duplicate D ON D.ID = D_First.ID AND D.Market = D_First.PL AND

    should be:

    INNER JOIN #Duplicate D ON D.ID = D_First.ID AND D.PL = D_First.PL AND

    Comparing Market to PL will have no matches, so the second update doesn't update anything.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • D'OH, quite right, a copy/paste mishap I think.  I have corrected the original code.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • >> I need some help with my query. Want to change numbers to 0 if the row is a duplicate record [sic: rows are not records] based on the uniqueness of 2 columns <<

    No, you need to get your DDL correct first. Did you know that the table must have a key by definition? Did you know that there's no such thing as a generic id in RDBMS? It must be the identifier of something in particular.

    >> For the Same ID and market, I want to retain one row of original values in market Complete and market Assigned, rest should be 0 <<

    This sounds horrible. What is the rule for determining which of the original values to retain? Is it depended on ordering? But wait, tables don't have an ordering! Being completed or assigned are status codes, not separate attributes. Please stop using the original SELECT..UNION table constructor. What is a “pl”? Why do you think this is immediately understood by anyone trying to read or maintain your code? Is it some industry-standard I do not know?

    Your sample code makes no sense to me and I've been at this over 30 years. You please post something useful?

    CREATE TABLE Markets

    (market _id CHAR(5) NOT NULL PRIMARY KEY,

    pl VARCHAR (5) NOT NULL, – define this attribute

    market_status CHAR(10) NOT NULL

    market_status IN (‘Completed’, ‘Assigned’),

    ..);

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 wrote:

    You please post something useful?

    Heh... you've been at this for 30 years and you still can't meet your own demands. 😉  Drop the tough guy troll act and post something that will actually help the OP.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Just a personal habit of mine... I don't update such data because, if someone adds interceding data, you might not be able to determine what should happen to it because other data has been converted to a "0".

    I'm also concerned about the sort order.  I was able to replicate your desired output with the following code but it's highly dependent on the ascending order of the Market column and the descending order of the PL column (to Joe's point, whatever the hell a "PL" is).  That means that if someone comes up with (especially) an out of order PL that cannot follow a descending order logically, all of this will break.

    Here's the code to produce the desired output without changing the underlying table...

     SELECT  ID
    ,Market
    ,PL
    ,MarketCompleted = IIF(LAG(MarketCompleted,1,-1) OVER (PARTITION BY ID,Market ORDER BY Market) = MarketCompleted,0,MarketCompleted)
    ,MarketAssigned = IIF(LAG(MarketAssigned ,1,-1) OVER (PARTITION BY ID,Market ORDER BY Market) = MarketAssigned ,0,MarketAssigned)
    ,PLCompleted = IIF(LAG(PLCompleted ,1,-1) OVER (PARTITION BY ID,PL ORDER BY PL DESC) = PLCompleted ,0,PLCompleted)
    ,PLAssigned = IIF(LAG(PLAssigned ,1,-1) OVER (PARTITION BY ID,PL ORDER BY PL DESC) = PLAssigned ,0,PLAssigned)
    FROM #Duplicate
    ORDER BY ID DESC, Market, PL DESC
    ;

    To Phil's good point, PLEASE learn how to use the code blocks when posting.  When you're creating a new post, look at the small icon menu bar and find the following... which will create a code window for you.  Just paste your well formatted code into it.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • When you set something to zero like this, you're changing the data,you're destroying the fact which Chris Date and I think most other RDBMS experts feel is a really bad.

    I'm quite sure that what this poster is trying to do is use SQL to format his data for display. This is considered heresy in any tiered architecture. I don't have enough time or space to post several of my books verbatim on a website for this guy. Look at his work; it really really is that bad! I'm not trolling him. I'm telling the truth.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 wrote:

    When you set something to zero like this, you're changing the data,you're destroying the fact which Chris Date and I think most other RDBMS experts feel is a really bad.

    I'm quite sure that what this poster is trying to do is use SQL to format his data for display. This is considered heresy in any tiered architecture. I don't have enough time or space to post several of my books verbatim on a website for this guy. Look at his work; it really really is that bad! I'm not trolling him. I'm telling the truth.

    Haha, this guy is always on here making a clown out of himself

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • jcelko212 32090 wrote:

    When you set something to zero like this, you're changing the data,you're destroying the fact which Chris Date and I think most other RDBMS experts feel is a really bad.

    I'm quite sure that what this poster is trying to do is use SQL to format his data for display. This is considered heresy in any tiered architecture. I don't have enough time or space to post several of my books verbatim on a website for this guy. Look at his work; it really really is that bad! I'm not trolling him. I'm telling the truth.

    Now that (the part about not updating the underlying data to meet reporting requirements), I absolutely and totally agree with.  That's why the code I posted didn't update any underlying data.  That's truly a "Data Sin".

    I also usually agree that using SQL to structure reporting is a Bozo-No-No but it's sometimes unavoidable.  That's why I made the comment I did.  Despite all that is holy in the world of T-SQL, you're sometimes presented with a problem where you sometimes have to bend the hell out of the rules without actually breaking them.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply