Merge

  • TomThomson

    SSC Guru

    Points: 104772

    Toreador (2/15/2013)


    sknox (2/14/2013)


    it relies on behavior specifically described by Microsoft as non-deterministic, and so should be considered dangerous.

    I wonder why Microsoft allow such code to parse as valid, given that they recognise that it should never be used?

    I don't think it's actually non-deterministic (given the contents of source and target the statement determines the result) and while MS reccomends not using it they don't, as far as I have seen, say it is non-deterministic. Probably the reason for MS allowing it is that the international SQL standard allows it (I could be wrong on that, haven't actually checked what the standard says).

    The reason to reccomend not using it is that conditions in the ON clause that are not specifically about whether a given pair of rows matches or not can have effects which, while perfectly deterministic and logical, are unexpected by most people (including me unless I think really hard about it) as for example where the target table starts empty, or where someone confuses being excluded from matching with being excluded from the action to be taken when there is no match for a given row - the latter being, I think, the real lesson of this QotD.

    Tom

  • Jason Wolfkill

    SSCrazy Eights

    Points: 9772

    davoscollective (2/14/2013)


    OK I see your point, but it would also be bad practice to outer join on a constant. I think it's another reason to keep join conditions separate to filter conditions.

    I too think it's generally a good idea to include only the join conditions in the ON clause and put the filter conditions in the WHERE/WHEN (NOT) MATCHED clauses. Even though SQL Server usually produces the same execution plan in both cases (although not with the MERGE statement, as demonstrated in this QotD!), I think it's much easier to understand how the results are returned when filter conditions aren't buried in the join conditions, especially for someone unfamiliar with the code.

    Jason Wolfkill

  • SQLRNNR

    SSC Guru

    Points: 281243

    Nice question. Thanks

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Iulian -207023

    SSCertifiable

    Points: 7508

    Nice one, thanks

  • PChiragS

    SSCarpal Tunnel

    Points: 4965

    Great QoTD..

    Thank You.

  • db4breakfast

    Ten Centuries

    Points: 1097

    L' Eomot Inversé (2/14/2013)


    Nice question. However, the explanation would have been better if it had pointed out that filters like this can be placed in the when matche/not matched conditions. In this merge statement the when clause should have been when not matched by target and src.RowNo = 1 if the intention was to get only the three rows with RowNo 1.

    In fact even filtering of the target table in the on clause other than with a matching codition between source and target is stated by MS to be bad practice, as it can lead to unexpected results (ie it doesn't necessarily do what you would expect it to do).

    BoL


    It is important to specify only the columns from the target table that are used for matching purposes. That is, specify columns from the target table that are compared to the corresponding column of the source table.

    At first when reading the answer for this question, I thought it's about the "and" condition (if the first condition fails, don't bother to evaluate 2nd one). Reading the explanation on BOL helps me understand the topic better. good question overall.

    --------------------------------------------------------------------------------------
    Hai Ton
    My Db4Breakfast blog.

  • Hugo Kornelis

    SSC Guru

    Points: 64675

    Coming into this a bit late....

    Another (better, I think) way to explain this, is to think of the USING ... ON as a FULL OUTER JOIN. After performing that join, all matching rows between the two tables are passed to the WHEN MATCHED clause(s), rows from the source that are not in the destination are passed to WHEN MATCHED BY TARGET, and rows in the destination that are not in the source are passed to WHEN MATCHED BY SOURCE. The additional "AND ..." condition on those clauses can then be used to filter out rows that need no action. However, I would personally strongly recommend filtering the source table in the USING itself, by using a derived table here. (Eg USING (SELECT ... FROM CTE WHERE RowNo = 1) AS src)

    I should have gotten this question right, but I didn't. Good question, helped me reinforce something that is apparently not as "top of mind" for me asa it should be.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • TomThomson

    SSC Guru

    Points: 104772

    Hugo Kornelis (2/28/2013)


    Coming into this a bit late....

    Another (better, I think) way to explain this, is to think of the USING ... ON as a FULL OUTER JOIN. After performing that join, all matching rows between the two tables are passed to the WHEN MATCHED clause(s), rows from the source that are not in the destination are passed to WHEN MATCHED BY TARGET, and rows in the destination that are not in the source are passed to WHEN MATCHED BY SOURCE. The additional "AND ..." condition on those clauses can then be used to filter out rows that need no action. However, I would personally strongly recommend filtering the source table in the USING itself, by using a derived table here. (Eg USING (SELECT ... FROM CTE WHERE RowNo = 1) AS src)

    That works fine when there is no "when not matched by source"; but filtering source in the using statement will add extra rows to be acted on by the not matched by source action, so it doesn't work when that clause exists. The only place to filter that is always both safe and also easy to understand is in the three (or how ever many there are in the particular statement) when clauses.

    Tom

  • Hugo Kornelis

    SSC Guru

    Points: 64675

    L' Eomot Inversé (2/28/2013)


    Hugo Kornelis (2/28/2013)


    Coming into this a bit late....

    Another (better, I think) way to explain this, is to think of the USING ... ON as a FULL OUTER JOIN. After performing that join, all matching rows between the two tables are passed to the WHEN MATCHED clause(s), rows from the source that are not in the destination are passed to WHEN MATCHED BY TARGET, and rows in the destination that are not in the source are passed to WHEN MATCHED BY SOURCE. The additional "AND ..." condition on those clauses can then be used to filter out rows that need no action. However, I would personally strongly recommend filtering the source table in the USING itself, by using a derived table here. (Eg USING (SELECT ... FROM CTE WHERE RowNo = 1) AS src)

    That works fine when there is no "when not matched by source"; but filtering source in the using statement will add extra rows to be acted on by the not matched by source action, so it doesn't work when that clause exists. The only place to filter that is always both safe and also easy to understand is in the three (or how ever many there are in the particular statement) when clauses.

    I guess I should have put it more clearly. "If you want to exclude rows from the source from the MERGE action, I recommend using a derived table in the USING clause, with a WHERE clause with the appropriate filter".


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • TomThomson

    SSC Guru

    Points: 104772

    Hugo Kornelis (2/28/2013)


    I guess I should have put it more clearly. "If you want to exclude rows from the source from the MERGE action, I recommend using a derived table in the USING clause, with a WHERE clause with the appropriate filter".

    With that clearer satement you've produced something I strongly agree with, because I think you mean something like "if you want to have a subset of some table as the source, do it the obvious and simple way, which is to use a derived table in the USING clause" and I always believe in doing things the obvious and simple way unless there are compelling reasons not to (eg when writing code for an obfuscation competition). Maybe I should have read your first comment as suggesting that, but I didn't: instead I saw it as suggesting this as a general alternative to doing excluding source rows in the when clauses, rather than applying only in the special case of wanting a different source (and that rather surprised me).

    Tom

Viewing 10 posts - 31 through 40 (of 40 total)

You must be logged in to reply to this topic. Login to reply