SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Use Excel to DIFF Your Query Outputs


Use Excel to DIFF Your Query Outputs

Author
Message
David Ziffer
David Ziffer
SSC Veteran
SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)

Group: General Forum Members
Points: 243 Visits: 203
Comments posted to this topic are about the item Use Excel to DIFF Your Query Outputs
Anipaul
Anipaul
SSCertifiable
SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)

Group: General Forum Members
Points: 7365 Visits: 1407
Nice article...

Happy New Year to you all...



Gregor Borosa
Gregor Borosa
SSC Veteran
SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)

Group: General Forum Members
Points: 294 Visits: 220
Hi... interesting idea, until now I was using Excel mainly for cross-checking query outputs by pivot tables, while using KDiff for diffing as it is specialized for the job.
Happy NewYear from me, too Wink
Christopher Stobbs
Christopher Stobbs
SSCrazy
SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)

Group: General Forum Members
Points: 2470 Visits: 2233
Great tool.

I'm already started using it and it's a charm.

I made some changes for mine so that it could work in any workbook regardless of the names of the sheets. It will do the campare on the first 2 sheets in the active workbook.

I've pasted my code under my personal xls modules so that my macro becomes global.
Here is the modified code if anyone is interested:



' Clear all the indicators that previous comparisons may have set.
Private Sub ClearMarkers()
' Clear Sheet1

'ThisWorkbook.Sheets.Select
ActiveWorkbook.Sheets(1).Activate
ActiveWorkbook.ActiveSheet.Cells.Select
Selection.Interior.ColorIndex = xlNone
Selection.Font.ColorIndex = 0
ActiveWorkbook.ActiveSheet.Cells(1, 1).Select ' undo selection of entire sheet
' Clear Sheet2
ActiveWorkbook.Sheets(2).Activate
ActiveWorkbook.ActiveSheet.Cells.Select
Selection.Interior.ColorIndex = xlNone
Selection.Font.ColorIndex = 0
ActiveWorkbook.ActiveSheet.Cells(1, 1).Select ' undo selection of entire sheet
End Sub

' Walk through Sheet1 and Sheet2, setting markers wherever differences
' in cell contents are found.
Public Sub Diff_Sheet_1_and_2()

On Error GoTo ErrHandle
Call ClearMarkers

' Determine range of used cells, using the highest row and column counts
' found in either of the two sheets.
Dim HighRow As Long
HighRow = ActiveWorkbook.Sheets(1).UsedRange.Rows.Count
If ActiveWorkbook.Sheets(2).UsedRange.Rows.Count > HighRow Then
HighRow = ActiveWorkbook.Sheets(2).UsedRange.Rows.Count
End If
Dim HighCol As Long
HighCol = ActiveWorkbook.Sheets(1).UsedRange.Columns.Count
If ActiveWorkbook.Sheets(2).UsedRange.Columns.Count > HighCol Then
HighCol = ActiveWorkbook.Sheets(2).UsedRange.Columns.Count
End If

' Walk through the cells of both sheets, comparing and changing colors
' if differences are found.
Dim RowIndex As Long
Dim ColIndex As Long

Dim RowFirst As Long
Dim ColFirst As Long

For RowIndex = 1 To HighRow

For ColIndex = 1 To HighCol

' Compare formulas, not "text" or other formatting-affected attributes.
If ActiveWorkbook.Sheets(1).Cells(RowIndex, ColIndex).Formula <> ActiveWorkbook.Sheets(2).Cells(RowIndex, ColIndex).Formula Then

' Determine how to indicate the difference on Sheet1.
If ActiveWorkbook.Sheets(1).Cells(RowIndex, ColIndex).Text = "" Then
' If the cell contains no text, highlight the empty cell.
ActiveWorkbook.Sheets(1).Select
ActiveWorkbook.Sheets(1).Cells(RowIndex, ColIndex).Select
Selection.Interior.ColorIndex = 38
Else
' If the cell contains text, color the text.
ActiveWorkbook.Sheets(1).Cells(RowIndex, ColIndex).Font.Color = &HFF
End If

' Determine how to indicate the difference on Sheet2.
If ActiveWorkbook.Sheets(2).Cells(RowIndex, ColIndex).Text = "" Then
' If the cell contains no text, highlight the empty cell.
ActiveWorkbook.Sheets(2).Select
ActiveWorkbook.Sheets(2).Cells(RowIndex, ColIndex).Select
Selection.Interior.ColorIndex = 38
Else
' If the cell contains text, color the text.
ActiveWorkbook.Sheets(2).Cells(RowIndex, ColIndex).Font.Color = &HFF
End If

' If this is the first row containing a difference, remember the cell
' in which the change occurred so we can return to it later.
If RowFirst = 0 Then
RowFirst = RowIndex
ColFirst = ColIndex
End If

End If
Next
Next

' Either report no differences or focus on the first difference found.
If RowFirst = 0 Then
MsgBox "No differences!"
Else
If ActiveWorkbook.ActiveSheet.Index = 1 Then
ActiveWorkbook.Sheets(1).Cells(RowFirst, ColFirst).Activate
End If
If ActiveWorkbook.ActiveSheet.Index = 2 Then
ActiveWorkbook.Sheets(2).Cells(RowFirst, ColFirst).Activate
End If
End If

Exit Sub

ErrHandle:
MsgBox Err.Description

End Sub



----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley

w00t
Posting Best Practices
Numbers / Tally Tables

SQL-4-Life
GSquared
GSquared
SSC-Insane
SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)

Group: General Forum Members
Points: 23755 Visits: 9730
My usual method for comparing data sets from SQL databases is just simply do an outer join between the two. For procs, that means inserting them into temp tables and then joining the temp tables.

For example:


create table #T1 (
CountryName varchar(100) primary key,
ISO2 char(2),
ISO3 char(3),
ContinentName varchar(100));

insert into #T1 (CountryName, ISO2, ISO3, ContinentName)
exec dbo.MyProc1;

create table #T2 (
CountryName varchar(100) primary key,
ISO2 char(2),
ISO3 char(3),
ContinentName varchar(100));

insert into #T2 (CountryName, ISO2, ISO3, ContinentName)
exec dbo.MyProc2;

select *
from #T1 t1
full outer join #T2 t2
on t1.CountryName = t2.CountryName
and t1.ISO2 = t2.ISO2
and t1.ISO3 = t2.ISO3
and t1.ContinentName = t2.ContinentName
where t2.CountryName is null
or t1.CountryName is null;



That will give you a full set of discrepancies between the two record sets.

If you want to change the comparison, change the join. For example, you might want to just know which countries have exact matches, but where the continent is different. So change the join to:


on t1.CountryName = t2.CountryName
and t1.ContinentName != t2.ContinentName



(And get rid of the where clause.)

You'll immediately have those differences. And so on, whatever comparisons you want to do.

Make a change to one of the procs, truncate the table its results are in, rerun the insert statement, and rerun your comparisons, till you get matching result sets.

If the result sets are large enough and the comparisons are complex enough, you can even add indexes to the temp tables to make the queries faster.

Also, if it matters, this doesn't run into the max rows issues that Excel has (though 2007 vastly increased the number of rows allowed).

Not saying this method is better, just saying it's what I use.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Chad Crawford
 Chad Crawford
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2856 Visits: 18718
Another tool to put in your belt is EXCEPT:
SELECT * FROM table1
EXCEPT
SELECT * FROM table2


will give you all the records in table1 that do not exactly match a record in table2. It doesn't display them side-by-side, but does tell you which record(s) are different. Reverse it to get the records in table2 that do not exactly match a record in table1.

Another quick shortcut I have used is:
SELECT COUNT(*) FROM table1
SELECT COUNT(*) FROM table2
SELECT COUNT(*)
FROM
(SELECT * FROM table1
UNION
SELECT * FROM table2) Combined


If all three counts are the same, the tables are identical. Again, if there are differences it doesn't tell you where they are, but it is very easy to see if there are differences.

Chad
Nick Thompson-356919
Nick Thompson-356919
Grasshopper
Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)

Group: General Forum Members
Points: 15 Visits: 12
Excel is a very versatile tool, as your article demonstrates.

A couple of mods I'd like to see in your code are:

1. Show the value from the other worksheet as a comment (saves skipping back and forth between worksheets to see what the difference that's been identified actually is). You can do this by adding these lines:

                Sheet1.Cells(RowIndex, ColIndex).AddComment "Value in worksheet " & Sheet2.Name & " is " & Sheet2.Cells(RowIndex, ColIndex).Formula
Sheet2.Cells(RowIndex, ColIndex).AddComment "Value in worksheet " & Sheet1.Name & " is " & Sheet1.Cells(RowIndex, ColIndex).Formula



insert the above code just after:

            ' Compare formulas, not "text" or other formatting-affected attributes.
If Sheet1.Cells(RowIndex, ColIndex).Formula <> Sheet2.Cells(RowIndex, ColIndex).Formula Then



NB: If you're running this multiple times on the same data you need to clear the existing comments first.

2. If a row in one sheet doesn't exist in the other sheet, this has the effect of labelling every subsequent row as being different. It would be useful to have the option to look ahead at the next x number of rows in the dataset and use them to determine whether a row's missing, and if it is then to flag that, rather than considering every subsequent row as changed.


Re the earlier post suggesting modifying the code to deal with different sheet names, the code in the article does already do this.

Nick
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search