SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Using FULL JOINs to Compare Datasets


Using FULL JOINs to Compare Datasets

Author
Message
rchantler
rchantler
SSC-Addicted
SSC-Addicted (439 reputation)SSC-Addicted (439 reputation)SSC-Addicted (439 reputation)SSC-Addicted (439 reputation)SSC-Addicted (439 reputation)SSC-Addicted (439 reputation)SSC-Addicted (439 reputation)SSC-Addicted (439 reputation)

Group: General Forum Members
Points: 439 Visits: 611
Good article, but are you actually creating random numbers in your example, using NewID()? Random and pseudo-random numbers have properties that this technique may not create.
markoos1
markoos1
SSC-Enthusiastic
SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)

Group: General Forum Members
Points: 141 Visits: 249
I'm just using the NEWID() function as a way to create unique random numbers for each row. When you use RAND() in a query like this, it generates the same number for each row.

This is just for illustration in this example. In real situations, the numeric measures come from the source and destination queries, and aren't random numbers.
le_billy
le_billy
Valued Member
Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)

Group: General Forum Members
Points: 62 Visits: 237
Thank you for the great article, I am seeing some work cut out for me to improve my previously written query :-D

Billy
Hardy21
Hardy21
SSCrazy
SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)

Group: General Forum Members
Points: 2702 Visits: 1399
Nice article and good examples. Thanks

Thanks
Misha_SQL
Misha_SQL
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1646 Visits: 1010
I love the articles with examples which work "out of the box". This is one of those. These very helpful techniques are going into my tool box. Thank you!



gpatterson-825050
gpatterson-825050
Valued Member
Valued Member (60 reputation)Valued Member (60 reputation)Valued Member (60 reputation)Valued Member (60 reputation)Valued Member (60 reputation)Valued Member (60 reputation)Valued Member (60 reputation)Valued Member (60 reputation)

Group: General Forum Members
Points: 60 Visits: 381
Thanks for the time and effort to produce this informative article.

One small thing I noticed is that this solution requires an assumption that a "missing" row in either destination or source implies a 0. Because of the way you are using ISNULL(), a particular ID that exists with a value of 0 on one side and does not exist at all on the other side will never be identified as a difference.

This is probably fine in many cases (e.g. daily sales data), but it might be conceivable to create a circumstance where it would matter. For example, if we are comparing a normalized (to z scores) data set, 0 would have a very different meaning and not be a logical default assumption.

Admittedly, my example is a poor and far fetched one. But I think this assumption is worth pointing out for those who may use this snippet of code.

Geoff
markoos1
markoos1
SSC-Enthusiastic
SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)

Group: General Forum Members
Points: 141 Visits: 249
GPatterson -- that's a very good point. I've run into that exact problem before. I was going to address that in part 2....but as a preview let me just say that the ISNULL() has to resolve to an impossible value, or else the full join will over-join. I had an ISNULL(<field>,0) once where the field had a possible value of 0, and this caused a problem.

Thanks for your input.
ashishtiwarivds
ashishtiwarivds
Grasshopper
Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)

Group: General Forum Members
Points: 18 Visits: 22
Nice article. Thanks for your efforts for putting this together.

Just wondering, what will happen if we don't use COALESCE() function? Generally when I am using LEFT JOIN to find delta at source I don't use this? Can you please elaborate?

Thanks again and appriciate your work.

Cheers!!
markoos1
markoos1
SSC-Enthusiastic
SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)SSC-Enthusiastic (141 reputation)

Group: General Forum Members
Points: 141 Visits: 249
Coalesce is needed because you don't know in advance if the value will be on the source side, the destination side, or both. If you only refer to one of the join keys, you will get nulls for the records that don't exist for that side. Coalesce solves this problem by using the first non-null value it finds. Does this make sense?

As an aside, you don't need coalesce for the a left join...only for a full join.
astmart
astmart
Grasshopper
Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)

Group: General Forum Members
Points: 19 Visits: 332
Great Article..and well laid out..
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search