SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Joins


Joins

Author
Message
peacesells
peacesells
Old Hand
Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)

Group: General Forum Members
Points: 346 Visits: 1481
CREATE TABLE TableKeys ( BusinessKeys NVARCHAR(MAX))

INSERT INTO TableKeys (BusinessKeys)
SELECT SourcTable.ColumnA+','+ SourcTable.ColumnB+','+ SourcTable.ColumnC+','+SourcTable.ColumnC FROM SourcTable

AND I perform a join Like this


SELECT blah..bla....
FROM TableKeys tk JOIN DimTable dl ON

RTRIM(LTRIM(REPLACE(REPLACE(tk.BusinessKeys,',',''),' ',''))) JOIN RTRIM(LTRIM(REPLACE(dl.ColumnE+dl.ColumnF+dl.ColumnE+dl.ColumnG,' ','')))

This join method is effective if number of businesskeys is 2 and when key columns are short but if it gets bigger ,it takes forever. Any alternatives to implement this type of string split joins
GilaMonster
GilaMonster
SSC Guru
SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)

Group: General Forum Members
Points: 90307 Visits: 45284
That is going to be incredibly inefficient. What are you trying to do there?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


peacesells
peacesells
Old Hand
Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)

Group: General Forum Members
Points: 346 Visits: 1481
GilaMonster (2/9/2013)
That is going to be incredibly inefficient. What are you trying to do there?


Yes I agree with you , when u say its going to be incredibly inefficient. As illustrated, i am taking comma separated business key from one table and join to columns from tableB and update attributes in tableB.
GilaMonster
GilaMonster
SSC Guru
SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)SSC Guru (90K reputation)

Group: General Forum Members
Points: 90307 Visits: 45284
May I suggest normalising the tables? As long as there are comma-delimited keys that's going to be hell to write queries against and will likely have horrid performance problems.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)

Group: General Forum Members
Points: 89477 Visits: 41144
I normally agree that storing CSVs in a table is almost as bad as storing XML in a table. ;-)

However, in this case, we're looking at the whole business key and not just elements embedded in the CSV. That means there may be a fairly easy work around with some very high performance available.

Using the respective formulas that you used in the ON criteria, create a persisted computed column in each table and index it. It's a pretty simple thing to do and makes for some nasty fast code.

BEFORE YOU DO THAT, THOUGH!!!! Adding a column to a table can break improperly written apps that use embedded INSERT statements that don't have an insert-column-list specified. The same goes for stored procedures.

There's a workaround for even that, though. Rename the old table by adding a suffix of your choosing to it. Create a "pass through" view using the problem columns in the SELECT list and name it the same as the original name of the table. Then add the new columns to the old table and you're done. Well, unless you want to hunt down the idiots that wrote the crap code that required the creatio of the view. :-P

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Steven Willis
Steven Willis
SSC Eights!
SSC Eights! (857 reputation)SSC Eights! (857 reputation)SSC Eights! (857 reputation)SSC Eights! (857 reputation)SSC Eights! (857 reputation)SSC Eights! (857 reputation)SSC Eights! (857 reputation)SSC Eights! (857 reputation)

Group: General Forum Members
Points: 857 Visits: 1721
This problem of finding matches based on the column proved more difficult than I anticipated once I started peeling the onion. To make it worth the effort, I wanted it to work for any number of columns in any order. DelimitedSplit8K to the rescue!

The problem is much easier if the first and second column orders are limited and fixed as in the OP. But then the procedure would likely be of little use except to solve one particular problem. As it is, I'm not sure what it's good for. Maybe I could use it to compare name and address columns after importing and parsing demographic data from a spreadsheet where the entire name and address were in a single unformatted column?

Whatever...the OP seemed to have a need for a solution and I hope this is more flexible and more efficient than trying to hard-code long column-lists with multiple replaces in a join. I didn't test it for scalability so before using it on very large tables it would need some thorough testing and performance tuning.

First some sample data.



--This represents a source table with columns where we want
--to look for matches in other columns in the same row. If we
--want to compare two different tables the relevant columns
--could be consolidated into one table like this example.

IF OBJECT_ID('tempdb..#SourceTable') IS NOT NULL
DROP TABLE #SourceTable

CREATE TABLE #SourceTable
(
id INT IDENTITY(1,1) NOT NULL
,ColA VARCHAR(10)
,ColB VARCHAR(10)
,ColC VARCHAR(10)
,ColD VARCHAR(10)
,ColE VARCHAR(10)
,ColF VARCHAR(10)
,ColG VARCHAR(10)
,ColH VARCHAR(10)
,ColI VARCHAR(10)
,ColJ VARCHAR(10)
,PRIMARY KEY (id)
);

INSERT INTO #SourceTable
SELECT 'Alpha','Bravo','Alpha','Zulu','Alpha','Bravo','Alpha','Papa','Quebec','Romeo'
UNION ALL
SELECT 'Delta','Echo','Foxtrot','Zulu','Delta','Echo','Foxtrot','Papa','Romeo','India'
UNION ALL
SELECT 'Charlie','Bravo','Alpha','Zulu','Charlie','Bravo','Alpha','Papa','Quebec','Delta'
UNION ALL
SELECT 'Xray','Yankee','Zulu','Zulu','Xray','Yankee','Zulu','Papa','Hotel','Hotel'
UNION ALL
SELECT 'Alpha','Bravo','Alpha','Zulu','Alpha','Bravo','Alpha','Papa','Hotel','Romeo'
UNION ALL
SELECT 'Delta','Golf','Delta','Zulu','Foxtrot','Golf','Echo','Hotel','Quebec','India'
UNION ALL
SELECT '','','','Zulu','Alpha','Bravo','Alpha','Papa','Quebec','Romeo'
UNION ALL
SELECT 'Delta','Golf','Delta','Zulu','','','','Papa','Quebec','Romeo';




Now create a procedure to do the matching:



CREATE PROCEDURE dbo.uspGetColumnMatch

@TableKeys VARCHAR(8000)
,@XrefKeys VARCHAR(8000)
,@isMatch BIT = 1

AS
BEGIN

SET NOCOUNT ON

/* For the data based on the table key cols */

IF OBJECT_ID('tempdb..#TableKeys') IS NOT NULL
DROP TABLE #TableKeys

CREATE TABLE #TableKeys
(
id INT IDENTITY(1,1) NOT NULL
,RefKeys VARCHAR(8000)
,PRIMARY KEY (id)
)

/* For the data based on the xref key cols */

IF OBJECT_ID('tempdb..#XrefKeys') IS NOT NULL
DROP TABLE #XrefKeys

CREATE TABLE #XrefKeys
(
id INT IDENTITY(1,1) NOT NULL
,RefKeys VARCHAR(8000)
,PRIMARY KEY (id)
)


DECLARE
@strTableKeys VARCHAR(8000)
,@strXrefKeys VARCHAR(8000)
,@sqlTableKeys VARCHAR(8000)
,@sqlXrefKeys VARCHAR(8000)
,@strValues1 VARCHAR(8000)
,@strValues2 VARCHAR(8000)
,@Delimiter1 VARCHAR(10)
,@Delimiter2 VARCHAR(10)


/* The delimiters are set to variables so they are easy to change */

SET @Delimiter1 = '|'
SET @Delimiter2 = '|'


/* Use the key columns to populate the temp tables */

SELECT
@strTableKeys = COALESCE(@strTableKeys + ',','')
+ CAST(dsk1.Item AS VARCHAR(8000))
FROM
dbo.DelimitedSplit8K(@TableKeys,',') AS dsk1

SELECT
@strXrefKeys = COALESCE(@strXrefKeys + ',','')
+ CAST(dsk2.Item AS VARCHAR(8000))
FROM
dbo.DelimitedSplit8K(@XrefKeys,',') AS dsk2


SET @sqlTableKeys =
'INSERT INTO #TableKeys (RefKeys) '+
'SELECT RTRIM(LTRIM(REPLACE('+REPLACE(@strTableKeys,',','+'',''+')+','' '',''''))) FROM #SourceTable'
EXEC(@sqlTableKeys)

SET @sqlXrefKeys =
'INSERT INTO #XrefKeys (RefKeys) '+
'SELECT RTRIM(LTRIM(REPLACE('+REPLACE(@strXrefKeys,',','+'',''+')+','' '',''''))) FROM #SourceTable'
EXEC(@sqlXrefKeys)


/* Create a two-dimensional array (delimited list) */

SELECT
@strValues1 = COALESCE(@strValues1 + @Delimiter1,'')
+ CAST(RefKeys AS VARCHAR(8000))
FROM
#TableKeys

SELECT
@strValues2 = COALESCE(@strValues2 + @Delimiter2,'')
+ CAST(RefKeys AS VARCHAR(8000))
FROM
#XrefKeys


/* Check for matches */

SELECT
dsk1.ItemNumber
,dsk1.Item AS Item1
,dsk2.Item AS Item2
,(CASE
WHEN dsk1.Item = dsk2.Item THEN 1
ELSE 0
END)
AS isMatch
FROM
dbo.DelimitedSplit8K(@strValues1,@Delimiter1) AS dsk1
CROSS APPLY
dbo.DelimitedSplit8K(@strValues2,@Delimiter2) AS dsk2
WHERE
dsk1.ItemNumber = dsk2.ItemNumber
AND ((@isMatch = 1 AND dsk1.Item = dsk2.Item)
OR
(@isMatch = 0))

END




Now you can look for matches in any crazy column order you can think of for an unlimited (well, at least up to 8K) number of columns. I threw in a flag to show or hide the unmatched columns.



EXEC dbo.uspGetColumnMatch
'ColA,ColB,ColC,ColC'
,'ColE,ColF,ColE,ColG'
,1

EXEC dbo.uspGetColumnMatch
'ColA,ColH,ColI,ColJ'
,'ColG,ColH,ColI,ColJ'
,0

EXEC dbo.uspGetColumnMatch
'ColA,ColB,ColC,ColD,ColE,ColF,ColI'
,'ColE,ColF,ColG,ColD,ColA,ColB,ColJ'
,0



Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search