what is the difference between these two queries

Question

Post reply

what is the difference between these two queries

jaango123

Old Hand

Points: 353

February 6, 2025 at 7:54 pm

#4531539

Please I am struggling to understand the difference between these two queries. It behaves differently as output

google bigquery query 1

UPDATE dde-demo-d001.sap_crm.document_flow_root_bods
    SET case_guid = ICT.case_guid
    FROM dde-demo-d001.sap_crm.document_flow_root_bods DFR
    INNER JOIN dde-demo-d001.sap_crm.inferred_case_transactions ICT
    ON DFR.transaction_header_guid = ICT.transaction_header_guid
    WHERE DFR.case_guid IS NULL;

query 2

UPDATE dde-demo-d001.sap_crm.document_flow_root_bods DFR
    SET case_guid = ICT.case_guid
    FROM (SELECT transaction_header_guid,case_guid FROM dde-demo-d001.sap_crm.inferred_case_transactions) ICT
    WHERE (DFR.case_guid IS NULL) and (DFR.transaction_header_guid = ICT.transaction_header_guid);

Context : First query uses inner join and second doesnt use any joins. I cant seem to find any difference between two queries as far as logic ic concerned.

the query with inner join gives the below error

UPDATE/MERGE must match at most one source row for each target row

where as the query 2 updates success with 5 rows.

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Site Owners SSC Guru Points: 80411 More actions · Answer 1

Thanks for posting your issue and hopefully someone will answer soon.

This is an automated bump to increase visibility of your question.

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 2

The "JOIN" in the second query is in the WHERE clause as...

and (DFR.transaction_header_guid = ICT.transaction_header_guid)

It's also referred to as an "old style" or "Non-SANSI" "Equi-Join"

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jonathan AC Roberts SSCoach Points: 18408 More actions · Answer 3

It looks like there is more than one row in ICT with the same transaction_header_guid.

In SQL Server you could correctly write it as this:

UPDATE DFR
   SET DFR.case_guid = ICT.case_guid
  FROM dde_demo_d001.sap_crm.document_flow_root_bods AS DFR
 CROSS APPLY (SELECT TOP(1) ICT.case_guid 
                FROM dde_demo_d001.sap_crm.inferred_case_transactions AS ICT 
               WHERE DFR.transaction_header_guid = ICT.transaction_header_guid
               ORDER BY ICT.case_guid -- Ensure deterministic results if necessary
             ) AS ICT
 WHERE DFR.case_guid IS NULL;

In BigQuery like this:

UPDATE dde_demo_d001.sap_crm.document_flow_root_bods AS DFR
   SET DFR.case_guid = sub.case_guid
  FROM (
        -- Get exactly one "best" row per transaction_header_guid
        SELECT transaction_header_guid, case_guid
          FROM (SELECT transaction_header_guid,
                       case_guid,
                       ROW_NUMBER() OVER (PARTITION BY transaction_header_guid ORDER BY case_guid) AS rn
                  FROM dde_demo_d001.sap_crm.inferred_case_transactions
               )
         WHERE rn = 1
       ) AS sub
 WHERE DFR.case_guid IS NULL
   AND DFR.transaction_header_guid = sub.transaction_header_guid;

This reply was modified 10 months, 3 weeks ago by Jonathan AC Roberts. Reason: write a BigQuery

jens t Newbie Points: 3 More actions · Answer 4

Hey, great question — this actually tripped me up before too. The main reason you're seeing different behavior is because of how BigQuery handles updates when there's more than one match per row.

In your first query with the INNER JOIN, if there's even one record in inferred_case_transactions that matches the same transaction_header_guid more than once, BigQuery throws that error — it doesn’t know which case_guid to use for the update. That’s what the error means when it says “must match at most one source row for each target row.” Even if you think it should be one-to-one, if there are hidden duplicates, it’ll fail.

But in your second query, the subquery behaves differently — BigQuery doesn’t treat it as a join in the same way, so as long as it finds a single match during execution, it just works. That’s why you're seeing it update 5 rows successfully.

You might want to check if your inferred_case_transactions table has duplicate transaction_header_guid values. If so, and you want to use the join approach, you'd need to pick just one row per match — like with ROW_NUMBER() and filtering to the first one. Let me know if you want help with that!