Compare SQL Server Results of Two Queries

By:   |   Comments (8)   |   Related: > TSQL


Problem

You want to compare the results from two similar SQL Server T-SQL queries to see if the results are identical. Possibly, you have a complex query that returns many rows that you want to modify to simplify or optimize or for any other reason. You want to ensure the new result set is identical to the original result set.  This require more than just comparing row counts. Saving the data to a file and using a compare tool is one way. However, you can use the power of SQL Server to do the compare to let you know if the results including all column values, are identical or not.

Solution

This tip will use Microsoft's sample database WideWorldImporters. I will show you how to quickly check the results from 2 queries to see if they are identical.

First Query – Overly complex

I mocked up an overly complex query that returns 20,895 rows.  This query has subqueries in the main SELECT and the WHERE clause.  Though I made up this example it is common to come across overly complex queries like this.

USE WideWorldImporters;
GO

SELECT 
   DISTINCT ISNULL((SELECT c.ColorName 
                    FROM [Warehouse].[Colors] c 
                    WHERE c.ColorID = si.ColorID),pt.PackageTypeName) AS 'ColorOrType',
   ISNULL(pt.PackageTypeName,(SELECT c.ColorName 
                              FROM [Warehouse].[Colors] c 
                              WHERE c.ColorID = si.ColorID)) AS 'TypeOrColor',
   si.StockItemID,
   si.StockItemName,
   pt.PackageTypeName,
   (SELECT c.ColorName 
    FROM [Warehouse].[Colors] c 
    WHERE c.ColorID = si.ColorID) AS ColorName,
   CASE WHEN p.SearchName LIKE '% H%'
        THEN SUBSTRING(p.SearchName,1,CHARINDEX(' ',p.SearchName)) 
        ELSE 'XXXXXX'
   END AS 'FirstName',
   TRIM(SUBSTRING(TRIM(SUBSTRING(p.SearchName, CHARINDEX(' ',p.SearchName), 50)), CHARINDEX(' ',p.SearchName), 50) ) AS 'LastName',
   il.*
FROM Warehouse.StockItems AS si
   LEFT JOIN [Warehouse].[PackageTypes] as pt ON pt.PackageTypeID = si.OuterPackageID
   LEFT JOIN [Sales].[InvoiceLines] as il ON il.StockItemID = si.StockItemID
   LEFT JOIN [Sales].[Invoices] as i ON i.InvoiceID = il.InvoiceID
   LEFT JOIN  [Application].[People] p ON p.PersonID = i.SalespersonPersonID
WHERE 
   (pt.PackageTypeName NOT like '__ch' OR EXISTS (SELECT c.ColorName 
                                                  FROM [Warehouse].[Colors] c 
                                                  WHERE c.ColorID = si.ColorID 
                                                    AND c.ColorName NOT like '%K'))
   AND il.InvoiceID IN (SELECT i.InvoiceID 
                        From [Sales].[Invoices] i 
                        INNER JOIN  [Application].[People] p ON  p.PersonID = i.SalespersonPersonID
                        WHERE p.IsSalesperson = 1 
                          AND p.SearchName LIKE '% H[io]%')
GO

Second Query - Simplified

After inspecting the first query and working to simplify it by making the following changes:

  • LEFT JOIN to [Warehouse].[Colors] used vs 4 Subquery
  • Switching [Application].[People] p.SearchName to P.FullName and simplifying the Substring function and Case Statement
  • Changed the [Sales].[Invoices] Subquery to INNER JOIN
  • Changing LEFT JOINS to INNER JOINS and removing Select DISTINCT

This query also returns 20,895 rows!  However, are the result sets identical?

USE WideWorldImporters;
GO

SELECT 
   ISNULL(c.ColorName,pt.PackageTypeName) AS 'ColorOrType',
   ISNULL(pt.PackageTypeName,c.ColorName) AS 'TypeOrColor',
   si.StockItemID,
   si.StockItemName,
   pt.PackageTypeName,
   c.ColorName,
   CASE WHEN P.FullName LIKE '% H%'
        THEN SUBSTRING(P.FullName,1,CHARINDEX(' ',P.FullName)) 
        ELSE 'XXXXXX'
   END AS 'FirstName',
   TRIM(SUBSTRING(p.FullName,CHARINDEX(' ',p.FullName), 50)) AS 'LastName',
   il.*
FROM Warehouse.StockItems AS si
   LEFT  JOIN [Warehouse].[Colors] as c on c.ColorID = si.ColorID
   INNER JOIN [Warehouse].[PackageTypes] as pt ON pt.PackageTypeID = si.OuterPackageID
   INNER JOIN [Sales].[InvoiceLines] as il ON il.StockItemID = si.StockItemID
   INNER JOIN [Sales].[Invoices] as i ON i.InvoiceID = il.InvoiceID
   INNER JOIN  [Application].[People] p ON p.PersonID = i.SalespersonPersonID
WHERE (pt.PackageTypeName <> 'Each' OR c.ColorName <> 'Black')
  AND p.IsSalesperson = 1 
  AND p.FullName LIKE '% H%'
GO

Comparing the Results of the Two Queries

The solution to this is very simple.  Run both queries using a UNION to combine the results! 

The UNION operator returns unique records. If the two results sets are identical the row count will remain the same as the original query.  If any rows do not match the original query those rows will be included in the result set thus the total rows returned by the UNIONed queries will exceed the row count of the original. 

In this case the result sets match, so I can feel confident that the new optimized simplified query is a better solution!

USE WideWorldImporters;
GO

--Example 1: Overly complex query that returns 20,895 rows 
SELECT 
   DISTINCT ISNULL((SELECT c.ColorName 
                    FROM [Warehouse].[Colors] c 
                    WHERE c.ColorID = si.ColorID),pt.PackageTypeName) AS 'ColorOrType',
   ISNULL(pt.PackageTypeName,(SELECT c.ColorName 
                              FROM [Warehouse].[Colors] c 
                              WHERE c.ColorID = si.ColorID)) AS 'TypeOrColor',
   si.StockItemID,
   si.StockItemName,
   pt.PackageTypeName,
   (SELECT c.ColorName 
    FROM [Warehouse].[Colors] c 
    WHERE c.ColorID = si.ColorID) AS ColorName,
   CASE WHEN p.SearchName LIKE '% H%'
        THEN SUBSTRING(p.SearchName,1,CHARINDEX(' ',p.SearchName)) 
        ELSE 'XXXXXX'
   END AS 'FirstName',
   TRIM(SUBSTRING(TRIM(SUBSTRING(p.SearchName, CHARINDEX(' ',p.SearchName), 50)), CHARINDEX(' ',p.SearchName), 50) ) AS 'LastName',
   il.*
FROM Warehouse.StockItems AS si
   LEFT JOIN [Warehouse].[PackageTypes] as pt ON pt.PackageTypeID = si.OuterPackageID
   LEFT JOIN [Sales].[InvoiceLines] as il ON il.StockItemID = si.StockItemID
   LEFT JOIN [Sales].[Invoices] as i ON i.InvoiceID = il.InvoiceID
   LEFT JOIN  [Application].[People] p ON p.PersonID = i.SalespersonPersonID
WHERE 
   (pt.PackageTypeName NOT like '__ch' OR EXISTS (SELECT c.ColorName 
                                                  FROM [Warehouse].[Colors] c 
                                                  WHERE c.ColorID = si.ColorID 
                                                    AND c.ColorName NOT like '%K'))
   AND il.InvoiceID IN (SELECT i.InvoiceID 
                        From [Sales].[Invoices] i 
                        INNER JOIN  [Application].[People] p ON  p.PersonID = i.SalespersonPersonID
                        WHERE p.IsSalesperson = 1 
                          AND p.SearchName LIKE '% H[io]%')

UNION  -- Use UNION to Ensure the result set are identical!
--Example 2: Clean up Query
SELECT 
   ISNULL(c.ColorName,pt.PackageTypeName) AS 'ColorOrType',
   ISNULL(pt.PackageTypeName,c.ColorName) AS 'TypeOrColor',
   si.StockItemID,
   si.StockItemName,
   pt.PackageTypeName,
   c.ColorName,
   CASE WHEN P.FullName LIKE '% H%'
        THEN SUBSTRING(P.FullName,1,CHARINDEX(' ',P.FullName)) 
        ELSE 'XXXXXX'
   END AS 'FirstName',
   TRIM(SUBSTRING(p.FullName,CHARINDEX(' ',p.FullName), 50)) AS 'LastName',
   il.*
FROM Warehouse.StockItems AS si
   LEFT  JOIN [Warehouse].[Colors] as c on c.ColorID = si.ColorID
   INNER JOIN [Warehouse].[PackageTypes] as pt ON pt.PackageTypeID = si.OuterPackageID
   INNER JOIN [Sales].[InvoiceLines] as il ON il.StockItemID = si.StockItemID
   INNER JOIN [Sales].[Invoices] as i ON i.InvoiceID = il.InvoiceID
   INNER JOIN  [Application].[People] p ON p.PersonID = i.SalespersonPersonID
WHERE (pt.PackageTypeName <> 'Each' OR c.ColorName <> 'Black')
  AND p.IsSalesperson = 1 
  AND p.FullName LIKE '% H%'
GO

Conclusion – Reviewing the Results

Now you know the secret to quickly use the power of SQL Server to compare two query results to ensure they are identical.

I have often used this technique when optimizing queries and stored procedures.  It is much easier and faster than extracting the result set to another tool.  The SQL Engine is very powerful and can compare millions of rows with a UNION very quickly. Using the power of SQL Server is the way to go! 

If the results set row counts match you are golden.  However, if the result sets do not match in row counts it may be difficult to find the rows that do not match especially with large result sets.  Some tips to find the rows that do not match include running both result sets with output to grid (in SQL Server Management Studio), trying different sort orders on rows such as dates, IDs, names, numbers like invoice numbers or order numbers.  Inspect the 2 result sets and look at both ends of the result set looking for the unmatched rows. Also, you can comment out some of the columns from both queries until you do get the query results to match, then use process of elimination to figure out the columns that do not match.

What if the you want to compare the results of queries or complex stored procedure or those that do inserts or updates?  In these cases, during testing add a Global Temporary Table (##temp) to hold the result sets by replacing the INSERT or UPDATE with a SELECT INTO ##temp statement or adding an Output Clause to the INSERT or UPDATE Statement. After executing both stored procs, compare the temporary tables with a UNION to ensure the record counts match the original query.

Note: In stored procedures, to compare the results, you must use global temporary tables because local temporary tables are dropped after a store procedure executes.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Jim Evans Jim Evans is an IT Manager currently for Crowe who has managed DBA, Developer, BI and Data Management teams for over 20 years.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Sunday, September 26, 2021 - 1:35:42 PM - lptk Back To Top (89280)
Here's some code to exemplify that this strategy may fail:


-- UNION ESTRATEGY
DECLARE @AAA TABLE([CD_PEDIDO_PALM] bigint NOT NULL)
INSERT INTO @AAA
VALUES (1),(2),(3),(4),(5),(6),(7)

DECLARE @bbb TABLE([CD_PEDIDO_PALM] bigint NOT NULL)
INSERT INTO @bbb
VALUES (1),(2),(3),(4),(5),(5),(5)



Declare @aux1 int;
set @aux1 = (select count(*) from @BBB);

Declare @aux2 int;
set @aux2 = (SELECT COUNT(*) FROM @AAA)

Declare @aux3 int;
set @aux3 = (SELECT COUNT(*) FROM (select * from @AAA union SELECT * FROM @bbb) as tb) -- for union to work it needs a alias


IF @aux1 = @aux2
begin
IF @aux1 = @aux3
BEGIN
PRINT 'Flow @flows_name was SUCCESSFUL!'
END
ELSE
BEGIN
PRINT 'Flow @flows_name failed.'
END
end
else
begin
PRINT 'Flow @flows_name failed.'
end

Tuesday, November 12, 2019 - 10:04:08 AM - Gina DeWildt Back To Top (83070)

I always use EXCEPT and switch the two queries so that you see the exact differences between the two result sets.


Monday, November 11, 2019 - 3:29:08 PM - Alex Back To Top (83060)

Hi Jim,
Thank you for a very helpful article !!


Monday, November 11, 2019 - 2:26:07 PM - Steve H Back To Top (83059)

UNION is a simple yet creative solution here, but it should be noted that this will only work if there are no duplicate rows within the result set.  If there are, this technique could give the wrong impression.    

As a simplified example, if result set 1 is (1, 1, 2, 3) and result set 2 is (1, 2, 3, 4) their union will give (1, 2, 3, 4) which is the correct number of rows, but the two sets are clearly not identical.  


Tuesday, October 29, 2019 - 10:30:54 PM - Jim Evans Back To Top (82928)

Thank you Ambrozyhorn for suggesting using NTERSECT OR EXCEPT to find the differences if the results row count don't match.
A very helpful suggestion!


Monday, October 28, 2019 - 10:27:14 AM - Joseph Morgan Back To Top (82920)

Lovely, clever use of UNION's properties. I love this sort of quick and simple shortcuts to get things done. Thanks for sharing!


Wednesday, October 23, 2019 - 6:05:54 PM - Francisco Mínguez Aroca Back To Top (82881)

Hi, I'm afraid that the procedure to decide that both SQLs give the same result is not correct.

If A U B = A we can only conclude that B is included in A, but not necessarily the same. For example, if l, at SQL A = {1,2,3} and SQL B = {1,2}, SQL A UNION SQL B = {1,2,3} and SQL A <> SQL B.

To verify the equality of two results, the property must be applied: A = B iif A - B = ø and B - A = ø, which in T-SQL is the EXCEPT clause.

That is, if SQL A EXCEPT SQL B empty and SQL B EXCEPT SQL A empty, we can safely conclude that SQL A = SQL B.

Thank you.


Wednesday, October 23, 2019 - 9:31:19 AM - Ambrozyhorn Back To Top (82873)

Hi Jim,

We could use INTERSECT OR EXCEPT when shows differents results.

thanks for explanation!















get free sql tips
agree to terms