SQLServerCentral is supported by Red Gate Software Ltd.

Compare two tables: Which fields are different?
Posted Wednesday, May 21, 2008 5:43 PM
Valued Member


Group: General Forum Members
Last Login: Wednesday, August 13, 2014 5:30 PM
Points: 55, Visits: 278
I do not want to reinvent the wheel, but I need to write a procedure that can be passed two table names and it will compare the tables and list the fields that are different. It can also show the old and new values. The assumptions are that the table structures are the same and there is an index on the base table, which would indicate what fields to join by. I have two Oracle scripts that handle this, but I need the same functionality in SQL Server 2005. Does anyone know of a script that does such a detailed comparison of two tables and does it generically with any two tables passed into the procedure?
Post #504916
Posted Wednesday, May 21, 2008 8:58 PM
UDP Broadcaster


Group: General Forum Members
Last Login: Friday, April 5, 2013 4:43 PM
Points: 1,473, Visits: 1,314
In SQL 2005, you can check the sys.columns catalog view.
Post #504958
Posted Wednesday, May 21, 2008 9:22 PM
Valued Member


Group: General Forum Members
Last Login: Wednesday, August 13, 2014 5:30 PM
Points: 55, Visits: 278
I believe the columns metadata would be used, but I am not looking for differences in the table structures. I am looking to compare the data in the two tables. But I do need to know the name of the fields where differences occur.
Post #504962
Posted Wednesday, May 21, 2008 10:12 PM
Mr or Mrs. 500


Group: General Forum Members
Last Login: Today @ 3:52 PM
Points: 528, Visits: 1,258
I have two Oracle scripts that handle this, but I need the same functionality in SQL Server 2005.


Can you please upload the Oracle scripts? We can then get an idea of what they are doing and try to map them to SQL Server 2005.
Post #504973
Posted Thursday, May 22, 2008 12:52 AM
SSC Journeyman


Group: General Forum Members
Last Login: Thursday, March 25, 2010 10:52 PM
Points: 89, Visits: 273
Hi,

Take a look at this article; it will help you with passing table names as parameters into a stored procedure.
http://www.sqlservercentral.com/articles/Stored+Procedures/2977/

Rajesh
Post #505007
Posted Thursday, May 22, 2008 8:20 AM
Valued Member


Group: General Forum Members
Last Login: Wednesday, August 13, 2014 5:30 PM
Points: 55, Visits: 278
Rather than posting a very long Oracle procedure, I will just describe what it does. First, let me say my solution is derived from an elegant one in an article by Sanjay Ray (http://www.oracle.com/technology/oramag/code/tips2004/012504.html). I was hoping there was an equally elegant SQL Server solution that I could adapt. If not, then perhaps this will make a very good article for someone when I figure it out. :)

Problem: I have a copy of TableA from yesterday and today. I want to know specifically what changed. The adds and deletes are simple, using the key and the EXCEPT and UNION ALL operators (see http://weblogs.sqlteam.com/jeffs/archive/2007/05/02/60194.aspx). For modifications, I want to know the key for the modified records and then the fields that were changed as well as the old and new values. This needs to be in a generic stored procedure so I can audit any two tables.

Solution: The trick is building a huge dynamic SQL statement. Pass the proc two tablenames, one of which is the "base" table for the comparison. The key structure is determined by looking up the fields in the metadata for indexes for the base table. If there is no index (there should be), the first column is used. The two tables will be joined by the key fields and every other field will be compared.

Using the list of columns from the metadata (INFORMATION_SCHEMA.Columns in SQL Server), the query is built for the non-key fields. The SELECT list is built dynamically from the metadata into a concatenated list for each field. Each field will end up looking something like:

CASE
WHEN A.column_name = B.column_name THEN null
ELSE 'column_name=['+A.column_name+'->'+B.column_name+']'
END

There is a delimiter (~) between each field in the SELECT list. So if the fields are equal the null will result in ~~ and if they are unequal, then the name of the column and the values from each table are put between the delimiters.

The dynamic SQL is actually an INSERT statement into another table. This table is then parsed to find the data between the delimiters and report the field names and the old and new values.
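To make the delimiter scheme concrete, here is a minimal Python sketch of the same build-and-parse idea. The column names and values are made up for illustration; the real procedure does this inside dynamic T-SQL:

```python
# Minimal sketch of the ~-delimiter trick, using made-up rows.
old_row = {"name": "Smith", "city": "Austin", "phone": "555-1234"}
new_row = {"name": "Smith", "city": "Dallas", "phone": "555-9999"}

# Build: each non-key column contributes an empty segment when equal,
# or 'col=[old->new]' when different (what the CASE expression emits).
segments = []
for col in old_row:
    if old_row[col] == new_row[col]:
        segments.append("")
    else:
        segments.append(f"{col}=[{old_row[col]}->{new_row[col]}]")
changes = "~".join(segments)
print(changes)  # ~city=[Austin->Dallas]~phone=[555-1234->555-9999]

# Parse: the non-empty text between delimiters is the change list.
changed = [s for s in changes.split("~") if s]
print(changed)  # ['city=[Austin->Dallas]', 'phone=[555-1234->555-9999]']
```

Equal fields collapse to back-to-back delimiters (~~), exactly as described above, so the parse step only ever sees the changed fields.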

Voila! I hope any of this made sense. As I work through each of the processes, I may return with more detailed questions. Right now I was just hoping someone else had already published a solution for SQL Server.
Post #505204
Posted Thursday, May 22, 2008 1:59 PM
Hall of Fame


Group: General Forum Members
Last Login: Tuesday, August 19, 2014 10:01 AM
Points: 3,844, Visits: 3,841
You may want to search SSC for a solution. I know for a fact that this is not the first time your question has been asked in these forums.

You may want to consider a more formal auditing technique if it is feasible. An auditing procedure like what you've suggested will be a very poor performer on large tables, as it will have to do a column-by-column comparison. At the very least, you may want to look at using a timestamp or datetime column to narrow the search down to only those rows where you know for sure changes have been made.




John Rowan

======================================================
Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
Post #505470
Posted Thursday, May 22, 2008 11:00 PM
SSC-Enthusiastic


Group: General Forum Members
Last Login: Thursday, August 21, 2014 12:53 AM
Points: 127, Visits: 414
Hi

You can try something like this; I guess it will solve your problem:


select distinct a.* from INFORMATION_SCHEMA.COLUMNS a
where a.column_name not in (select column_name from INFORMATION_SCHEMA.COLUMNS b
where b.table_name in ('stp_rule_dtl') ) ---- Table A
and a.table_name in ('stp_rule_esc_dtl') --- Table B
order by a.column_name

Thanks
jaypee.s
Post #505614
Posted Wednesday, June 4, 2008 3:04 AM
Forum Newbie


Group: General Forum Members
Last Login: Tuesday, June 3, 2014 4:20 AM
Points: 3, Visits: 124
How about this...

create procedure CompareTables @table1 varchar(100), @table2 varchar(100)
as
declare @sql varchar(8000)
set @sql = 'select ''' + @table1 + ''' as tblName, * from
(select * from ' + @table1 + '
except
select * from ' + @table2 + ') x
union all
select ''' + @table2 + ''' as tblName, * from
(select * from ' + @table2 + '
except
select * from ' + @table1 +') x'

exec(@sql)
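For anyone wanting to try the symmetric-difference idea without two SQL Server tables handy, here is roughly the same query run through Python's sqlite3 module (the table names and data are invented for the demo):

```python
import sqlite3

# t1/t2 stand in for @table1/@table2 in the dynamic SQL above.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t1 (id INTEGER, val TEXT);
    CREATE TABLE t2 (id INTEGER, val TEXT);
    INSERT INTO t1 VALUES (1,'a'), (2,'b'), (3,'c');
    INSERT INTO t2 VALUES (1,'a'), (2,'B'), (4,'d');
""")
rows = con.execute("""
    SELECT 't1' AS tblName, * FROM
      (SELECT * FROM t1 EXCEPT SELECT * FROM t2)
    UNION ALL
    SELECT 't2' AS tblName, * FROM
      (SELECT * FROM t2 EXCEPT SELECT * FROM t1)
""").fetchall()
for r in rows:
    print(r)  # rows present in one table but not the other
```

One caveat either way: this flavor tells you which rows differ, not which columns within them, which is the part the original question is really after.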
Post #511211
Posted Wednesday, June 4, 2008 7:33 AM
Valued Member


Group: General Forum Members
Last Login: Wednesday, August 13, 2014 5:30 PM
Points: 55, Visits: 278
Let me try to briefly explain the solution I have. I apologize in advance for the lack of supporting info, but I do not have time to pretty it up and make it generic for public consumption. Also keep in mind that I am new to SQL Server, so cut me some slack on poor coding technique. Any suggestions for improvements, especially for performance, are welcome. More details can be provided if desired.

Ultimately I want a table, called Admin.Data_Changes, to be populated with one record for every field that was changed for the base table. It is designed for keeping the old and new values, but time has not permitted me to implement that part yet. This table is used by my users to audit changes from yesterday. So they have reports that query the Data_Changes table and report what they want.

CREATE TABLE [Admin].[Data_Changes] ( 
[table_name] [varchar] ( 256 )
,[action] [varchar] ( 10 )
,[key_fields] [varchar] ( 1000 )
,[key_values] [varchar] ( 1000 )
,[fieldname] [varchar] ( 256 )
,[old_value] [varchar] ( 255 )
,[new_value] [varchar] ( 255 )
,[dw_load_dttm] [datetime] NULL
)

I have a stored procedure which is passed the name of the base table and the comparison table. There are some basic edit checks to ensure the table structures match, the tables exist, and the base table has a primary key. Then I call another stored procedure which does all the hard work and populates a staging table with one record per modified record. The staging table has the structure ...

CREATE TABLE [Stage].[Data_Changes] (
[Table_Name] [varchar] (256) ,
[Action] [varchar] (10) ,
[Key_Fields] [varchar] (1000) ,
[Key_Values] [varchar] (1000) ,
[Changes] [varchar] (max)
)

After the staging table is populated for the base table, the final table is populated. First any previous records for this table are removed and then the staging data is parsed into one record for each changed field. @Table1 is the base table.

    --  Clear out any previous audit records
DELETE FROM Admin.Data_Changes WHERE Table_Name = @Table1

-- Parse the staging records into one record for each field changed
INSERT INTO Admin.Data_Changes (Table_name, Action, Key_fields, Key_values, Fieldname, DW_load_dttm)
SELECT Table_name, Action, Key_fields, Key_values,
SUBSTRING('~'+ Changes +'~',N+1,CHARINDEX('~','~'+ Changes +'~',N+1)-N-1) AS Fieldname,
getdate()
FROM Admin.Tally a CROSS JOIN Stage.Data_Changes b
WHERE a.N < LEN('~'+ Changes +'~')
AND SUBSTRING('~'+ Changes +'~',N,1) = '~'
AND SUBSTRING('~'+ Changes +'~',N+1,1) <> '~'
AND table_name = @Table1
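The tally-table parse is easier to see in a procedural sketch. Here is a rough Python equivalent of the WHERE conditions above, with N playing the role of the tally number and a hypothetical Changes value:

```python
# Hypothetical Changes value: two changed fields with one unchanged
# (empty) slot between them, as the CASE expressions would produce.
changes = "city~~phone"
s = "~" + changes + "~"              # '~' + Changes + '~', as in the SQL

fieldnames = []
for n in range(len(s) - 1):          # n plays the role of Tally.N
    # SUBSTRING(s,N,1) = '~' AND SUBSTRING(s,N+1,1) <> '~'
    if s[n] == "~" and s[n + 1] != "~":
        end = s.index("~", n + 1)    # CHARINDEX('~', s, N+1)
        fieldnames.append(s[n + 1:end])
print(fieldnames)  # ['city', 'phone']
```

Each delimiter that is immediately followed by text marks the start of a changed-field name, and the next delimiter ends it; empty ~~ slots are skipped entirely.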


Now for the hard working stored procedure. The setup ...


CREATE PROCEDURE [Admin].[Data_Changes_Staging]
@p_base_tablename varchar(256),
@p_comp_tablename varchar(256),
@p_details varchar(256)
AS
set nocount on
BEGIN
DECLARE
@Table1 varchar(256),
@Schema1 varchar(256),
@Table2 varchar(256),
@Schema2 varchar(256),
@Key_Fields varchar(max),
@Key_Values varchar(max),
@Changes varchar(max),
@Where varchar(max),
@From varchar(max),
@SQL varchar(max),
@i integer ;

BEGIN TRY
SELECT @Table1 = PARSENAME(@p_base_tablename, 1),
@Schema1 = PARSENAME(@p_base_tablename, 2),
@Table2 = PARSENAME(@p_comp_tablename, 1),
@Schema2 = PARSENAME(@p_comp_tablename, 2)

DELETE FROM Stage.Data_Changes WHERE Table_Name = @Table1


The final dynamic SQL will be an INSERT statement into the staging table, Stage.Data_Changes. The pieces of that INSERT statement are built dynamically from the columns and constraints system metadata tables. First, piece together the FROM clause.

  SET @From = ' FROM ' + @p_base_tablename + ' A, ' + @p_comp_tablename + ' B '

Then the table Information_Schema.Table_Constraints is used to get the columns used for the primary key. This info is used to build the WHERE clause that will join the two tables. It is also used to populate two of the fields in the table. @Key_Fields is the primary key field names strung together. @Key_Values is the values of those primary key fields, strung together. Note that the values are converted to varchar so they can be strung together.

  SELECT
@Where =
stuff( ( select ' and A.' + k1.column_name + ' = B.' + k1.column_name
from information_schema.key_column_usage k1
where k1.constraint_name = c.constraint_name
order by ordinal_position
for xml path('') )
, 1,4,' WHERE '),
@Key_Fields =
stuff( ( select '~' + k1.column_name
from information_schema.key_column_usage k1
where k1.constraint_name = c.constraint_name
order by ordinal_position
for xml path('') )
, 1,1,'') ,

      @Key_Values = 
stuff( ( select '+ ''~'' + convert(varchar,A.' + k1.column_name + ')'
from information_schema.key_column_usage k1
where k1.constraint_name = c.constraint_name
order by ordinal_position
for xml path('') )
, 1,8,'')
FROM INFORMATION_SCHEMA.Table_Constraints c
WHERE c.table_schema = @Schema1
AND c.table_name = @Table1
AND c.constraint_type = 'PRIMARY KEY'
GROUP BY constraint_name
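As a side note, the FOR XML PATH('') / STUFF() pair is just doing string aggregation. Here is a small Python sketch of what the first two variables end up holding, using a hypothetical two-column primary key:

```python
# Hypothetical key columns, in ordinal_position order (these would come
# from information_schema.key_column_usage in the real procedure).
key_columns = ["order_id", "line_no"]

# ' and A.c = B.c' per column; STUFF(..., 1, 4, ' WHERE ') then swaps
# the leading ' and' for ' WHERE ' (leaving a harmless double space).
joined = "".join(f" and A.{c} = B.{c}" for c in key_columns)
where = " WHERE " + joined[4:]
print(where)       #  WHERE  A.order_id = B.order_id and A.line_no = B.line_no

# @Key_Fields: '~'-separated names, the leading '~' stuffed away.
key_fields = "~".join(key_columns)
print(key_fields)  # order_id~line_no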

The only thing left is figuring out all the changes in the fields. The table Information_Schema.Columns is used to dynamically get all the columns for the tables. For each column, a CASE statement is built. When the two columns are equal, it simply concatenates the delimiter to the string; when they are not equal, it concatenates the name of the column and the delimiter. At this time it does not give the values from each table, but that will be added to this statement, with changes only to the string and not the logic.

Note a couple of things when stringing together @Changes. In order to handle all data types and nulls, the columns are converted to varchar and set to '' if null. Also note that in my example I have excluded some column names that may appear in my tables whose changes I am not worried about. The load dates and times are expected to differ between yesterday and today, so there is no point checking those, and I do not want to know that someone's age changed because today is their birthday; if the Birthdate is changed, I will pick that up. This exclusion is completely optional for the logic to work.

  SELECT @Changes = 
stuff( ( select '+ case when isnull(convert(varchar,A.' + c1.column_name +
'),'''') = isnull(convert(varchar,B.' + c1.column_name +
'),'''') then ''~'' else ''' + c1.column_name + '~'' end '
from information_schema.columns c1
where c1.column_name not in ('LOAD_DTTM','LASTUPDDTTM','DW_LOAD_DTTM','UW_LOAD_DTTM','AGE')
and c1.table_schema = c2.table_schema
and c1.table_name = c2.table_name for xml path('') )
, 1,1,'')
FROM information_schema.columns c2
WHERE table_schema = @Schema1 and table_name = @Table1
GROUP BY c2.table_schema, table_name
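To show what that generation step emits, here is a Python sketch with hypothetical column metadata (the skip-list mirrors the one in the query above):

```python
# Hypothetical columns of the base table; the real list would come
# from information_schema.columns.
columns = ["cust_id", "name", "city", "AGE", "DW_LOAD_DTTM"]
skip = {"LOAD_DTTM", "LASTUPDDTTM", "DW_LOAD_DTTM", "UW_LOAD_DTTM", "AGE"}

# One CASE fragment per non-excluded column, exactly as built above.
case_parts = [
    f"+ case when isnull(convert(varchar,A.{c}),'') = "
    f"isnull(convert(varchar,B.{c}),'') then '~' else '{c}~' end "
    for c in columns if c not in skip
]
# STUFF(..., 1, 1, '') drops the leading '+'.
changes_expr = "".join(case_parts)[1:]
print(changes_expr)
```

The resulting @Changes text is pasted into the SELECT list of the dynamic INSERT, so each row of the staging table gets one ~-delimited string summarizing its changed columns.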

That's it! Piece all this together and clean it up and you can find out what fields changed when comparing any two tables.
Post #511362