This problem of finding matches based on the column proved more difficult than I anticipated once I started peeling the onion. To make it worth the effort, I wanted it to work for any number of columns in any order. DelimitedSplit8K to the rescue!
The problem is much easier if the first and second column orders are limited and fixed as in the OP. But then the procedure would likely be of little use except to solve one particular problem. As it is, I'm not sure what it's good for. Maybe I could use it to compare name and address columns after importing and parsing demographic data from a spreadsheet where the entire name and address were in a single unformatted column?
Whatever...the OP seemed to have a need for a solution and I hope this is more flexible and more efficient than trying to hard-code long column-lists with multiple replaces in a join. I didn't test it for scalability so before using it on very large tables it would need some thorough testing and performance tuning.
First some sample data.
--This represents a source table with columns where we want
--to look for matches in other columns in the same row. If we
--want to compare two different tables the relevant columns
--could be consolidated into one table like this example.
IF OBJECT_ID('tempdb..#SourceTable') IS NOT NULL
DROP TABLE #SourceTable
CREATE TABLE #SourceTable
(
id INT IDENTITY(1,1) NOT NULL
,ColA VARCHAR(10)
,ColB VARCHAR(10)
,ColC VARCHAR(10)
,ColD VARCHAR(10)
,ColE VARCHAR(10)
,ColF VARCHAR(10)
,ColG VARCHAR(10)
,ColH VARCHAR(10)
,ColI VARCHAR(10)
,ColJ VARCHAR(10)
,PRIMARY KEY (id)
);
INSERT INTO #SourceTable
SELECT 'Alpha','Bravo','Alpha','Zulu','Alpha','Bravo','Alpha','Papa','Quebec','Romeo'
UNION ALL
SELECT 'Delta','Echo','Foxtrot','Zulu','Delta','Echo','Foxtrot','Papa','Romeo','India'
UNION ALL
SELECT 'Charlie','Bravo','Alpha','Zulu','Charlie','Bravo','Alpha','Papa','Quebec','Delta'
UNION ALL
SELECT 'Xray','Yankee','Zulu','Zulu','Xray','Yankee','Zulu','Papa','Hotel','Hotel'
UNION ALL
SELECT 'Alpha','Bravo','Alpha','Zulu','Alpha','Bravo','Alpha','Papa','Hotel','Romeo'
UNION ALL
SELECT 'Delta','Golf','Delta','Zulu','Foxtrot','Golf','Echo','Hotel','Quebec','India'
UNION ALL
SELECT '','','','Zulu','Alpha','Bravo','Alpha','Papa','Quebec','Romeo'
UNION ALL
SELECT 'Delta','Golf','Delta','Zulu','','','','Papa','Quebec','Romeo';
Now create a procedure to do the matching:
CREATE PROCEDURE dbo.uspGetColumnMatch
@TableKeys VARCHAR(8000)
,@XrefKeys VARCHAR(8000)
,@isMatch BIT = 1
AS
BEGIN
SET NOCOUNT ON
/* For the data based on the table key cols */
IF OBJECT_ID('tempdb..#TableKeys') IS NOT NULL
DROP TABLE #TableKeys
CREATE TABLE #TableKeys
(
id INT IDENTITY(1,1) NOT NULL
,RefKeys VARCHAR(8000)
,PRIMARY KEY (id)
)
/* For the data based on the xref key cols */
IF OBJECT_ID('tempdb..#XrefKeys') IS NOT NULL
DROP TABLE #XrefKeys
CREATE TABLE #XrefKeys
(
id INT IDENTITY(1,1) NOT NULL
,RefKeys VARCHAR(8000)
,PRIMARY KEY (id)
)
DECLARE
@strTableKeys VARCHAR(8000)
,@strXrefKeys VARCHAR(8000)
,@sqlTableKeys VARCHAR(8000)
,@sqlXrefKeys VARCHAR(8000)
,@strValues1 VARCHAR(8000)
,@strValues2 VARCHAR(8000)
,@Delimiter1 VARCHAR(10)
,@Delimiter2 VARCHAR(10)
/* The delimiters are set to variables so they are easy to change */
SET @Delimiter1 = '|'
SET @Delimiter2 = '|'
/* Use the key columns to populate the temp tables */
SELECT
@strTableKeys = COALESCE(@strTableKeys + ',','')
+ CAST(dsk1.Item AS VARCHAR(8000))
FROM
dbo.DelimitedSplit8K(@TableKeys,',') AS dsk1
SELECT
@strXrefKeys = COALESCE(@strXrefKeys + ',','')
+ CAST(dsk2.Item AS VARCHAR(8000))
FROM
dbo.DelimitedSplit8K(@XrefKeys,',') AS dsk2
SET @sqlTableKeys =
'INSERT INTO #TableKeys (RefKeys) '+
'SELECT RTRIM(LTRIM(REPLACE('+REPLACE(@strTableKeys,',','+'',''+')+','' '',''''))) FROM #SourceTable'
EXEC(@sqlTableKeys)
SET @sqlXrefKeys =
'INSERT INTO #XrefKeys (RefKeys) '+
'SELECT RTRIM(LTRIM(REPLACE('+REPLACE(@strXrefKeys,',','+'',''+')+','' '',''''))) FROM #SourceTable'
EXEC(@sqlXrefKeys)
/* Create a two-dimensional array (delimited list) */
SELECT
@strValues1 = COALESCE(@strValues1 + @Delimiter1,'')
+ CAST(RefKeys AS VARCHAR(8000))
FROM
#TableKeys
SELECT
@strValues2 = COALESCE(@strValues2 + @Delimiter2,'')
+ CAST(RefKeys AS VARCHAR(8000))
FROM
#XrefKeys
/* Check for matches */
SELECT
dsk1.ItemNumber
,dsk1.Item AS Item1
,dsk2.Item AS Item2
,(CASE
WHEN dsk1.Item = dsk2.Item THEN 1
ELSE 0
END)
AS isMatch
FROM
dbo.DelimitedSplit8K(@strValues1,@Delimiter1) AS dsk1
CROSS APPLY
dbo.DelimitedSplit8K(@strValues2,@Delimiter2) AS dsk2
WHERE
dsk1.ItemNumber = dsk2.ItemNumber
AND ((@isMatch = 1 AND dsk1.Item = dsk2.Item)
OR
(@isMatch = 0))
END
Now you can look for matches in any crazy column order you can think of for an unlimited (well, at least up to 8K) number of columns. I threw in a flag to show or hide the unmatched columns.
EXEC dbo.uspGetColumnMatch
'ColA,ColB,ColC,ColC'
,'ColE,ColF,ColE,ColG'
,1
EXEC dbo.uspGetColumnMatch
'ColA,ColH,ColI,ColJ'
,'ColG,ColH,ColI,ColJ'
,0
EXEC dbo.uspGetColumnMatch
'ColA,ColB,ColC,ColD,ColE,ColF,ColI'
,'ColE,ColF,ColG,ColD,ColA,ColB,ColJ'
,0