Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Using "Multiple Assignment Variables" for a Generic Compare Data Script


Using "Multiple Assignment Variables" for a Generic Compare Data Script

Author
Message
zintp
zintp
SSC Veteran
SSC Veteran (247 reputation)SSC Veteran (247 reputation)SSC Veteran (247 reputation)SSC Veteran (247 reputation)SSC Veteran (247 reputation)SSC Veteran (247 reputation)SSC Veteran (247 reputation)SSC Veteran (247 reputation)

Group: General Forum Members
Points: 247 Visits: 370
Comments posted to this topic are about the item Using "Multiple Assignment Variables" for a Generic Compare Data Script
Paul White
Paul White
SSChampion
SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)

Group: General Forum Members
Points: 11146 Visits: 11353
Not only is it important to name columns (as opposed to using '*') - it is important to name them and return them in the same order!

Nothing in the 'MAV' query guarantees the order of column names returned. Use FOR XML PATH with an explicit ORDER BY clause instead (it's faster and entirely documented):


SELECT STUFF(
(
SELECT N',' + sc.name
FROM sys.columns sc
WHERE sc.[object_id] = OBJECT_ID(N'dbo.Widget', N'U')
ORDER BY
sc.column_id ASC
FOR XML PATH(''), TYPE
).value(N'.[1]', 'NVARCHAR(MAX)'), 1, 1, '');





Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
vince.iacoboni@db.com
vince.iacoboni@db.com
Right there with Babe
Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)

Group: General Forum Members
Points: 767 Visits: 549
A quick note on that trailing comma. Instead of

select @col_list = @col_list + col + ', '



which leaves the trailing comma that must be taken off, you might try

set @col_list = null     -- though defaults to null when defined...
select @col_list = nullif(@col_list + ',', '') + col





Bradley Deem
Bradley Deem
Mr or Mrs. 500
Mr or Mrs. 500 (591 reputation)Mr or Mrs. 500 (591 reputation)Mr or Mrs. 500 (591 reputation)Mr or Mrs. 500 (591 reputation)Mr or Mrs. 500 (591 reputation)Mr or Mrs. 500 (591 reputation)Mr or Mrs. 500 (591 reputation)Mr or Mrs. 500 (591 reputation)

Group: General Forum Members
Points: 591 Visits: 1248
I can see where this could come in handly during ETL processes.

In regards to creating a delimited list see the example here on MSDN http://msdn2.microsoft.com/en-us/library/ms131056.aspx. Using a CLR it will create a comma delimited list. Since it's an Aggregate Function you can use it just like any other aggregate which makes it really handy.
yaaadman
yaaadman
Grasshopper
Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)

Group: General Forum Members
Points: 23 Visits: 242
Hi Stephen,
I was wondering if there's a reason you use information_schema rather than the new sys.columns? Backwards compatibility? I believe in 2005 onwards information_schema is just a wrapper for the new object views (sys.columns, sys.objects, etc.)...
pbarbin
pbarbin
SSC Rookie
SSC Rookie (30 reputation)SSC Rookie (30 reputation)SSC Rookie (30 reputation)SSC Rookie (30 reputation)SSC Rookie (30 reputation)SSC Rookie (30 reputation)SSC Rookie (30 reputation)SSC Rookie (30 reputation)

Group: General Forum Members
Points: 30 Visits: 349
Thanks for the article, I use this "feature" quite a bit. One of mine is similar to yours, I have a script that creates insert statements using column names in order to reduce the size of our production database for testing purposes. We copy over only the records we want from the production db into an empty db, the process is much faster than deleting and resizing the db.

I realize that your article is just for explanation purposes, but when I compare the contents of two tables that have the same structure, I use the CheckSum_Agg() and CheckSum() functions. You can start by comparing the entire tables with just one checksum value. If the values don't match, you can show the rows that differ by comparing each row's checksum with the checksum of the corresponding tables row (using primary keys to match rows).

Syntax is:

SELECT CheckSum_Agg(CheckSum(*)) FROM TableName
SELECT Checksum(*) FROM TableName WHERE PrimaryKey=1

Paul
stevemc
stevemc
SSC-Enthusiastic
SSC-Enthusiastic (101 reputation)SSC-Enthusiastic (101 reputation)SSC-Enthusiastic (101 reputation)SSC-Enthusiastic (101 reputation)SSC-Enthusiastic (101 reputation)SSC-Enthusiastic (101 reputation)SSC-Enthusiastic (101 reputation)SSC-Enthusiastic (101 reputation)

Group: General Forum Members
Points: 101 Visits: 671
Nice article. A minor nit-pick. You need to initialize the @col variable to an empty string befory running the MAV query. Otherwise, your result will be NULL.

I've also used this for quick and dirty CSV list generation. For long lists, though, say over 500 elements, a FOR XML query, like Paul White's above, is much faster.
john.moreno
john.moreno
Old Hand
Old Hand (383 reputation)Old Hand (383 reputation)Old Hand (383 reputation)Old Hand (383 reputation)Old Hand (383 reputation)Old Hand (383 reputation)Old Hand (383 reputation)Old Hand (383 reputation)

Group: General Forum Members
Points: 383 Visits: 1112
Now, in a previous post of mine discussing comparing tables using EXCEPT <reference>, I noted that it is best practice to explicitly name your columns in any query using EXCEPT.


I saw this in your previous article, but didn't respond to it there...I think you're wrong, and that this is one of the rare instances where it's best NOT to explicitly name your columns. If the column order is different between the two tables, then there may be other differences as well. Of course there can be other differences without the column order being different or the column order could be the only difference, but since it's simpler to use * and the failure provides you with useful information (that the structure of the two tables has branched at some point), that's what I'd use. Of course you may need to do the comparision in any case, in which case todays script can come in handy.

Speaking of todays script, using
@col_list = IsNull(@col_list +',' , '') + col

prevents having to delete a trailing comma.
carl.anderson-1037280
carl.anderson-1037280
Grasshopper
Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)

Group: General Forum Members
Points: 16 Visits: 127
If the purpose of this script is to detect whether two tables with identical structure have different data, it might be nice to check the row counts before jumping into a potentially expensive comparison.

Just a thought.


Carl Anderson
Data Architect, Northwestern University

Paul White
Paul White
SSChampion
SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)

Group: General Forum Members
Points: 11146 Visits: 11353
@viacoboni: a construction using COALESCE is normally used, but in any case I can see no reason to prefer any variation of the method over the FOR XML solution (documented, supported, and faster...)

@Bradley Deem: True - and I am a great fan of appropriate CLR usage; however, FOR XML still performs faster than even a CLR UDA - so a call-out to the hosted CLR seems unnecessary...unless you have special requirements of course!

@yaadman: Absoutely - and I was going to include a quote from BOL to make the point that system views are preferred over INFORMATION_SCHEMA...but I didn't want to pile on too much ;-) I agree that it is rarely better to use anything other than the system views like sys.columns in this case.

@pbarbin: Yes - there are many better approaches to this sort of problem - and carl.anderson mentions another worthwhile optimization in a later post (checking row counts). HashBytes, CHECKSUM, the TableDiff utility...it is a long list.

@Steve McNamee: Are you saying that the technique used in the article is faster than FOR XML PATH for fewer than 500 elements? Not nit-picking; I just want to be clear on it. For me, the greater problem is that using variables in this way lacks an ORDER guarantee. Incorrect results, even when produced fractionally more quickly, are rarely to be desired :-P

Good discussion w00t



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search