Use Excel to DIFF Your Query Outputs

,

Use Excel to DIFF Your Query Outputs

When modifying SQL code or when rewriting SQL-based applications, it is a good idea to ensure that the new outputs from queries, views, stored procedures and user functions match the former outputs, except of course for intended changes. While coding you may even want to improvise queries to compare old algorithms to new algorithms, in order to avoid changing so much code so quickly that you cannot recover from bad changes, and also to verify that intended changes do indeed occur. One way to quickly make such comparisons is to use Excel as your comparison tool.

The "CompareSheets" Spreadsheet

It is fairly easy to build a comparison utility using Excel. You can create the utility yourself using the instructions here, or you can simply download the finished workbook from this page (refer to the section that relates to this article). To compare the outputs of two queries, it makes sense to have an Excel workbook containing two spreadsheets, one to hold each query output. We start by making a new Excel workbook containing two pages named Sheet1 and Sheet2:

I like to widen the cells somewhat from the default, as shown here. You can uniformly set the width (or height) of ALL cells in a sheet by highlighting the entire sheet (click on the corner box at the intersection of the two headers) then adjusting the width of any one column (or the height of any one row). Now save the workbook with the name CompareSheets.xls.

The only remaining thing you need to convert this workbook into a comparison utility is a small amount of VBA code. The Diff routine (second routine below) indicates the presence of discrepancies between Sheet1 and Sheet2 by setting either the foreground or background color of each cell in which a discrepancy is found. The ClearMarkers routine (just below) clears all such "markers" that may have been set in a previous comparison, allowing you to perform multiple data comparisons by repeatedly pasting new versions of one or both sets of data and then rerunning the Diff routine.

Using the Visual Basic editor, insert the ClearMarkersroutine into the ThisWorkbook module:

' Clear all the indicators that previous comparisons may have set.
Private Sub ClearMarkers()
    ' Clear Sheet1
    Sheet1.Select
    Sheet1.Cells.Select
    Selection.Interior.ColorIndex = xlNone
    Selection.Font.ColorIndex = 0
    Sheet1.Cells(1, 1).Select ' undo selection of entire sheet
    ' Clear Sheet2
    Sheet2.Select
    Sheet2.Cells.Select
    Selection.Interior.ColorIndex = xlNone
    Selection.Font.ColorIndex = 0
    Sheet2.Cells(1, 1).Select ' undo selection of entire sheet
End Sub

Here is the Diff routine, which you should also copy into ThisWorkbook::

' Walk through Sheet1 and Sheet2, setting markers wherever differences
' in cell contents are found.
Public Sub Diff_Sheet_1_and_2()
    On Error GoTo ErrHandle
    Call ClearMarkers
    
    ' Determine range of used cells, using the highest row and column counts
    ' found in either of the two sheets.
    Dim HighRow As Long
    HighRow = Sheet1.UsedRange.Rows.Count
    If Sheet2.UsedRange.Rows.Count > HighRow Then
        HighRow = Sheet2.UsedRange.Rows.Count
    End If
    Dim HighCol As Long
    HighCol = Sheet1.UsedRange.Columns.Count
    If Sheet2.UsedRange.Columns.Count > HighCol Then
        HighCol = Sheet2.UsedRange.Columns.Count
    End If
    
    ' Walk through the cells of both sheets, comparing and changing colors
    ' if differences are found.
    Dim RowIndex As Long
    Dim ColIndex As Long
    
    Dim RowFirst As Long
    Dim ColFirst As Long
    
    For RowIndex = 1 To HighRow
    
        For ColIndex = 1 To HighCol
        
            ' Compare formulas, not "text" or other formatting-affected attributes.
            If Sheet1.Cells(RowIndex, ColIndex).Formula <> Sheet2.Cells(RowIndex, ColIndex).Formula Then
            
                ' Determine how to indicate the difference on Sheet1.
                If Sheet1.Cells(RowIndex, ColIndex).Text = "" Then
                    ' If the cell contains no text, highlight the empty cell.
                    Sheet1.Select
                    Sheet1.Cells(RowIndex, ColIndex).Select
                    Selection.Interior.ColorIndex = 38
                Else
                    ' If the cell contains text, color the text.
                    Sheet1.Cells(RowIndex, ColIndex).Font.Color = &HFF
                End If
                
                ' Determine how to indicate the difference on Sheet2.
                If Sheet2.Cells(RowIndex, ColIndex).Text = "" Then
                    ' If the cell contains no text, highlight the empty cell.
                    Sheet2.Select
                    Sheet2.Cells(RowIndex, ColIndex).Select
                    Selection.Interior.ColorIndex = 38
                Else
                    ' If the cell contains text, color the text.
                    Sheet2.Cells(RowIndex, ColIndex).Font.Color = &HFF
                End If
                
                ' If this is the first row containing a difference, remember the cell
                ' in which the change occurred so we can return to it later.
                If RowFirst = 0 Then
                    RowFirst = RowIndex
                    ColFirst = ColIndex
                End If
                
            End If
        Next
    Next
    
    ' Either report no differences or focus on the first difference found.
    If RowFirst = 0 Then
        MsgBox "No differences!"
    Else
        If ThisWorkbook.ActiveSheet.Index = 1 Then
            Sheet1.Cells(RowFirst, ColFirst).Activate
        End If
        If ThisWorkbook.ActiveSheet.Index = 2 Then
            Sheet2.Cells(RowFirst, ColFirst).Activate
        End If
    End If
    
    Exit Sub
    
ErrHandle:
    MsgBox Err.Description
End Sub

The Diff routine first calls ClearMarkers to clear all the "markers" on both sheets that may have been set in a previous comparison. It then effectively computes the "used range" for both sheets simultaneously (i.e. the used range of the union of the data on both sheets) by using the largest "UsedRange" value from either sheet in each dimension. This ensures that if either sheet contains more rows or columns than the other, the Diff routine will flag all such data as differences.

Next, the routine walks through every populated cell on either sheet, comparing the sheets cell-for-cell to see if their formulas differ. Note that the routine uses the cells' formulas, not their text or other properties that might be influenced by formatting. (However even the formulas might be modified by Excel when pasting in your data. For example if you have numeric strings such as zip codes that might have leading zeroes, you'll need to convert the data types of the appropriate columns on both sheets to "text" before pasting in order to prevent unwanted stripping of leading zeroes).

The Diff routine does one more thing: it remembers which cell contained the first discrepancy, so that at the end of the comparison it can move you to that cell in whichever sheet is active. This is very important with very large sets of data, where you might otherwise have to eyeball thousands of rows and columns in order to locate any of the discrepancies that the macro highlighted.

If you are building your own CompareSheets.xls rather than simply downloading it, then at this point you should save the workbook, close Excel and make the file read-only. This will prevent you from inadvertently saving the workbook with any data in it. After making the file read-only, reopen it with Excel.

Now let's run a test case. Let's assume we have a query that produces a list of country names, their corresponding ISO codes, and their respective continents. We present the entire list here in text form so that you can copy and paste it into your own version of CompareSheets.xls and then modify it to match the examples below:

CountryNameISO2ISO3ContinentName
AfghanistanAFAFGAsia
AlbaniaALALBEurope
AlgeriaDZDZAAfrica
American SamoaASASMOceania
AndorraADANDEurope
AngolaAOAGOAfrica
AnguillaAIAIANorth America
AntarcticaAQATAAntarctica
Antigua And BarbudaAGATGNorth America
ArgentinaARARGSouth America
ArmeniaAMARMAsia
ArubaAWABWNorth America
AustraliaAUAUSOceania
AustriaATAUTEurope
AzerbaijanAZAZEAsia
BahamasBSBHSNorth America
BahrainBHBHRAsia
BangladeshBDBGDAsia
BarbadosBBBRBNorth America
BelarusBYBLREurope
BelgiumBEBELEurope
BelizeBZBLZNorth America
BeninBJBENAfrica
BermudaBMBMUNorth America
BhutanBTBTNAsia
BoliviaBOBOLSouth America
Bosnia And HerzegovinaBABIHEurope
BotswanaBWBWAAfrica
Bouvet IslandBVBVTAntarctica
BrazilBRBRASouth America
British Indian Ocean TerritoryIOIOTAsia
Brunei DarussalamBNBRNAsia
BulgariaBGBGREurope
Burkina FasoBFBFAAfrica
BurundiBIBDIAfrica
CambodiaKHKHMAsia
CameroonCMCMRAfrica
CanadaCACANNorth America
Cape VerdeCVCPVAfrica
Cayman IslandsKYCYMNorth America
Central African RepublicCFCAFAfrica
ChadTDTCDAfrica
ChileCLCHLSouth America
ChinaCNCHNAsia
Christmas IslandCXCXRAsia
Cocos (Keeling) IslandsCCCCKAsia
ColombiaCOCOLSouth America
ComorosKMCOMAfrica
CongoCGCOGAfrica
Cook IslandsCKCOKOceania
Costa RicaCRCRINorth America
Cote D'ivoireCICIVAfrica
Croatia (Hrvatska)HRHRVEurope
CubaCUCUBNorth America
CyprusCYCYPAsia
Czech RepublicCZCZEEurope
DenmarkDKDNKEurope
DjiboutiDJDJIAfrica
DominicaDMDMANorth America
Dominican RepublicDODOMNorth America
East TimorTLTLSAsia
EcuadorECECUSouth America
EgyptEGEGYAfrica
El SalvadorSVSLVNorth America
Equatorial GuineaGQGNQAfrica
EritreaERERIAfrica
EstoniaEEESTEurope
EthiopiaETETHAfrica
Falkland Islands (Malvinas)FKFLKSouth America
Faroe IslandsFOFROEurope
FijiFJFJIOceania
FinlandFIFINEurope
FranceFRFRAEurope
French GuianaGFGUFSouth America
French PolynesiaPFPYFOceania
French Southern TerritoriesTFATFAntarctica
GabonGAGABAfrica
GambiaGMGMBAfrica
GeorgiaGEGEOAsia
GermanyDEDEUEurope
GhanaGHGHAAfrica
GibraltarGIGIBEurope
GreeceGRGRCEurope
GreenlandGLGRLNorth America
GrenadaGDGRDNorth America
GuadeloupeGPGLPNorth America
GuamGUGUMOceania
GuatemalaGTGTMNorth America
GuineaGNGINAfrica
Guinea-BissauGWGNBAfrica
GuyanaGYGUYSouth America
HaitiHTHTINorth America
Heard Island & McDonald IslandsHMHMDAntarctica
HondurasHNHNDNorth America
Hong KongHKHKGAsia
HungaryHUHUNEurope
IcelandISISLEurope
IndiaININDAsia
IndonesiaIDIDNAsia
Iran, Islamic Republic OfIRIRNAsia
IraqIQIRQAsia
IrelandIEIRLEurope
IsraelILISRAsia
ItalyITITAEurope
JamaicaJMJAMNorth America
JapanJPJPNAsia
JordanJOJORAsia
KazakhstanKZKAZAsia
KenyaKEKENAfrica
KiribatiKIKIROceania
Korea, Democratic People's Republic OfKPPRKAsia
Korea, Republic OfKRKORAsia
KuwaitKWKWTAsia
KyrgyzstanKGKGZAsia
Laos, People's Democratic RepublicLALAOAsia
LatviaLVLVAEurope
LebanonLBLBNAsia
LesothoLSLSOAfrica
LiberiaLRLBRAfrica
Libyan Arab JamahiriyaLYLBYAfrica
LiechtensteinLILIEEurope
LithuaniaLTLTUEurope
LuxembourgLULUXEurope
MacauMOMACAsia
MacedoniaMKMKDEurope
MadagascarMGMDGAfrica
MalawiMWMWIAfrica
MalaysiaMYMYSAsia
MaldivesMVMDVAsia
MaliMLMLIAfrica
MaltaMTMLTEurope
Marshall IslandsMHMHLOceania
MartiniqueMQMTQNorth America
MauritaniaMRMRTAfrica
MauritiusMUMUSAfrica
MayotteYTMYTAfrica
MexicoMXMEXNorth America
Micronesia, Federated States OfFMFSMOceania
Moldova, Republic OfMDMDAEurope
MonacoMCMCOEurope
MongoliaMNMNGAsia
MontserratMSMSRNorth America
MoroccoMAMARAfrica
MozambiqueMZMOZAfrica
MyanmarMMMMRAsia
NamibiaNANAMAfrica
NauruNRNRUOceania
NepalNPNPLAsia
NetherlandsNLNLDEurope
Netherlands AntillesANANTNorth America
New CaledoniaNCNCLOceania
New ZealandNZNZLOceania
NicaraguaNINICNorth America
NigerNENERAfrica
NigeriaNGNGAAfrica
NiueNUNIUOceania
Norfolk IslandNFNFKOceania
Northern Mariana IslandsMPMNPOceania
NorwayNONOREurope
OmanOMOMNAsia
PakistanPKPAKAsia
PalauPWPLWOceania
PanamaPAPANNorth America
Papua New GuineaPGPNGOceania
ParaguayPYPRYSouth America
PeruPEPERSouth America
PhilippinesPHPHLAsia
PitcairnPNPCNOceania
PolandPLPOLEurope
PortugalPTPRTEurope
Puerto RicoPRPRINorth America
QatarQAQATAsia
ReunionREREUAfrica
RomaniaROROUEurope
Russian FederationRURUSEurope
RwandaRWRWAAfrica
Saint Kitts And NevisKNKNANorth America
Saint LuciaLCLCANorth America
Saint Vincent And The GrenadinesVCVCTNorth America
SamoaWSWSMOceania
San MarinoSMSMREurope
Sao Tome And PrincipeSTSTPAfrica
Saudi ArabiaSASAUAsia
SenegalSNSENAfrica
SerbiaCSSCGEurope
SeychellesSCSYCAfrica
Sierra LeoneSLSLEAfrica
SingaporeSGSGPAsia
Slovakia (Slovak Republic)SKSVKEurope
SloveniaSISVNEurope
Solomon IslandsSBSLBOceania
SomaliaSOSOMAfrica
South AfricaZAZAFAfrica
SpainESESPEurope
Sri LankaLKLKAAsia
Saint HelenaSHSHNAfrica
Saint Pierre And MiquelonPMSPMNorth America
SudanSDSDNAfrica
SurinameSRSURSouth America
Svalbard And Jan Mayen IslandsSJSJMEurope
SwazilandSZSWZAfrica
SwedenSESWEEurope
SwitzerlandCHCHEEurope
Syrian Arab RepublicSYSYRAsia
Taiwan, Province Of ChinaTWTWNAsia
TajikistanTJTJKAsia
Tanzania, United Republic OfTZTZAAfrica
ThailandTHTHAAsia
TogoTGTGOAfrica
TokelauTKTKLOceania
TongaTOTONOceania
Trinidad And TobagoTTTTONorth America
TunisiaTNTUNAfrica
TurkeyTRTURAsia
TurkmenistanTMTKMAsia
Turks And Caicos IslandsTCTCANorth America
TuvaluTVTUVOceania
UgandaUGUGAAfrica
UkraineUAUKREurope
United Arab EmiratesAEAREAsia
United KingdomUKGBREurope
United StatesUSUSANorth America
United States Minor Outlying IslandsUMUMINorth America
UruguayUYURYSouth America
UzbekistanUZUZBAsia
VanuatuVUVUTOceania
Vatican City State (Holy See)VAVATEurope
VenezuelaVEVENSouth America
Viet NamVNVNMAsia
Virgin Islands (British)VGVGBNorth America
Virgin Islands (U.S.)VIVIRNorth America
Wallis And Futuna IslandsWFWLFOceania
Western SaharaEHESHAfrica
YemenYEYEMAsia
ZaireZRZARAfrica
ZambiaZMZMBAfrica
ZimbabweZWZWEAfrica

To do our first comparison, paste the above table into Sheet1 of CompareSheets.xls. Then paste the same data onto Sheet2, and let's see what the macro does when we compare two identical sets of data:

Dismiss the dialog box by clicking the "OK" button. Now let's suppose we have modified the query that produced this output so that it mistakenly truncates every ContinentName to 10 characters. You can simulate the output of such a query by changing every occurrence of "America" to "Amer" in column "D" of Sheet2. Running our macro again we get:

Note two things:

  • Every discrepancy is indicated by red-colored text.
     
  • The macro has indicated the location of the first discrepancy by making it the current cell. This may seem irrelevant in this small example, but it's very important in very large examples..

If you switch over to Sheet1, you'll notice that the same corresponding cells are marked in the same way (and of course the data in them is different).

Now let's suppose that we have modified our query to inadvertently emit an empty string instead of the ContinentName "North America". You can simulate this problem by copying the data from Sheet1 to Sheet2, then substituting an empty string for "North America" in the ContinentName column. Let's run our macro again:


Sheet 1


Sheet 2

Here the results are somewhat different. On Sheet2there is no text to highlight in the differing cells, so the macro highlights the background instead. In the corresponding cells on Sheet1, the text is highlighted as before.

Summary

I've used this macro for years on every database project on which I've ever worked and it has probably saved me hundreds of hours, not to mention the headaches that would have resulted from unknowingly implementing code changes that produced unexpected output differences. This technique is far faster than any reporting-based solution because it allows you to instantly adapt to the schemas of any new queries you might dream up in the course of doing your coding.

System Requirements

This Excel macro was implemented using Excel 2003; it has been tested in both Excel 2003 and Excel 2007.

Author

You can contact me (Dave Ziffer) via the "Contact" link on my company's web site at www.ProjectPro.com. I'm also on LinkedIn. To keep abreast of my upcoming articles and projects, join the LinkedIn "Rapid Application Prototype" group.

Rate

4 (5)

Share

Share

Rate

4 (5)