November 2, 2017 at 4:33 am
Guys,
I need to join two tables together - Master and Detail
Master Table:
ID = 7000776601*1*6994
Code = P*W*S
Instances = 6994ý12510ý12874ý13245ý13609ý13973ý14337ý14701ý15009ý15072ý15436ý15800ý16164ý16528ý16899
Details Table:
ID = 7000776601*1*6994*P*W*S*15436
ChargeEffectiveDate = 15436ý15436ý15436ý15436ý15436ý15436ý15436ý15072
So the Details table key is the Master.ID + Master.Code + one of the values in Master.Instances
There are multiple Detail records (there should be one for each value in Instances, but I haven't validated this).
There will always be the same number of elements in the compound key string, but the length of those elements can change, so I can't just do LEFT(23).
What is the best strategy for doing a 1:n match on these? The two strategies I have thought of are:
1) REVERSE() the Details key, find the position of the first *, take the RIGHT() from the star, and then REVERSE() back again. This will be non-SARGable, though, and these are fairly big tables, so a table scan is a big performance hit.
2) Run Master.Instances through Jeff Moden's DelimitedSplit8K into a temp table, join Master to the temp table on a 1:n JOIN, and then join the temp table to Details on a 1:1 JOIN. This uses more disk IO but is indexable.
Does anyone have an opinion on which is the better strategy, or can you propose another?
FWIW I don't care about the instance, I just need to get all the Detail records for the Master.
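To illustrate, option 2 might be sketched like this (a sketch only; it assumes Jeff Moden's DelimitedSplit8K with its usual (string, delimiter) parameters and Item output column, and that the Details key is built as ID + '*' + Code + '*' + instance):

```sql
-- Sketch of option 2: split Instances into a temp table, index it, then join.
IF OBJECT_ID('tempdb..#Instances') IS NOT NULL DROP TABLE #Instances;

SELECT m.ID,
       m.Code,
       m.ID + '*' + m.Code + '*' + s.Item AS DetailID
INTO   #Instances
FROM   Master m
CROSS APPLY dbo.DelimitedSplit8K(m.Instances, 'ý') s;

-- The index that makes the final join sargable
CREATE CLUSTERED INDEX IX_Instances_DetailID ON #Instances (DetailID);

SELECT d.*
FROM   #Instances i
INNER JOIN Details d ON d.ID = i.DetailID;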
November 2, 2017 at 5:00 am
This isn't what an RDBMS is designed for, so performance won't be great. But you can try this (feel free to correct any syntax errors that may have slipped in):
-- If elements in Instances are always five characters
ON Details.ID LIKE Master.ID + '*' + Master.Code + '*%'
AND Master.Instances LIKE '%' + RIGHT(Details.ID, 5) + '%'
-- If elements in Instances are of variable length
-- (the -2 accounts for the two * separators not present in Master.ID + Master.Code)
ON Details.ID LIKE Master.ID + '*' + Master.Code + '*%'
AND Master.Instances LIKE '%' + RIGHT(Details.ID, LEN(Details.ID) - LEN(Master.ID + Master.Code) - 2) + '%'
You might also experiment with PATINDEX instead of LIKE on the second line, to see which performs better.
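One way to assemble those fragments into a full query (a sketch; padding the ý delimiter onto both ends of Instances guards against partial-number matches, e.g. 6994 matching inside 16994):

```sql
SELECT d.*
FROM   Details d
INNER JOIN Master m
        -- Details.ID starts with Master.ID + '*' + Master.Code + '*'
        ON d.ID LIKE m.ID + '*' + m.Code + '*%'
        -- ...and the trailing instance appears in the delimited Instances list
       AND 'ý' + m.Instances + 'ý' LIKE
           '%ý' + RIGHT(d.ID, LEN(d.ID) - LEN(m.ID + m.Code) - 2) + 'ý%';
```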
John
November 2, 2017 at 5:08 am
This isn't going to be a pretty join, no. I don't see any reason to split into a temporary table, though; do the job in one go:
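Something along these lines (a sketch; it assumes the usual DelimitedSplit8K signature of (string, delimiter) returning an Item column, and that the Details key is built as ID + '*' + Code + '*' + instance):

```sql
SELECT d.*
FROM   Master m
-- Split the Instances string into one row per instance value
CROSS APPLY dbo.DelimitedSplit8K(m.Instances, 'ý') s
-- Rebuild the compound Details key and join on it directly
INNER JOIN Details d
        ON d.ID = m.ID + '*' + m.Code + '*' + s.Item;
```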
As you can see, this does the JOIN in one pass, using Jeff's DelimitedSplit8K to separate the different parts out. This isn't going to be particularly quick on large tables, though.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
November 2, 2017 at 5:20 am
My thinking behind the Temp Table is that I can build a suitable index on it, otherwise we are likely to be doing table scans.
November 2, 2017 at 5:24 am
aaron.reese - Thursday, November 2, 2017 5:20 AM: My thinking behind the Temp Table is that I can build a suitable index on it, otherwise we are likely to be doing table scans.
But you'd need a table scan to get all the data into the temp table, wouldn't you, plus a scan of the temp table to build the index?
John
November 2, 2017 at 5:25 am
aaron.reese - Thursday, November 2, 2017 5:20 AM: My thinking behind the Temp Table is that I can build a suitable index on it, otherwise we are likely to be doing table scans.
You'll be doing that anyway when you put the data in the temporary table.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
November 2, 2017 at 8:08 am
aaron.reese - Thursday, November 2, 2017 4:33 AM
You missed the third and best long-term option: normalize your tables. You're running into these issues because your tables aren't normalized, and they go away if you properly normalize them. I understand that you may be working with a third-party product and unable to modify the table structure.
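A normalized shape might look something like this (illustrative only; every name and type here is invented, not taken from the actual system):

```sql
-- One row per master record; the compound strings become plain columns/rows.
CREATE TABLE Master (
    MasterID varchar(50) NOT NULL PRIMARY KEY,  -- e.g. '7000776601*1*6994'
    Code     varchar(20) NOT NULL               -- e.g. 'P*W*S'
);

-- One row per value formerly packed into the Instances string
CREATE TABLE MasterInstance (
    MasterID varchar(50) NOT NULL REFERENCES Master (MasterID),
    Instance int         NOT NULL,              -- e.g. 6994, 12510, ...
    PRIMARY KEY (MasterID, Instance)
);

CREATE TABLE Detail (
    DetailID            int IDENTITY PRIMARY KEY,
    MasterID            varchar(50) NOT NULL,
    Instance            int         NOT NULL,
    ChargeEffectiveDate int         NOT NULL,
    FOREIGN KEY (MasterID, Instance)
        REFERENCES MasterInstance (MasterID, Instance)
);
```

With that in place, getting all Detail records for a Master is an ordinary equality join on MasterID.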
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
November 3, 2017 at 6:41 am
drew.allen - Thursday, November 2, 2017 8:08 AM
Drew
Normalize my tables - Why didn't I think of that 😛
Unfortunately, not my tables 🙁
These are 3rd-party application tables, legacy from pre-RDBMS days. They are slowly migrating them to 3NF, but currently work on the basis of If It Ain't Broke...
November 6, 2017 at 12:53 pm
It seems as if you'd be scanning both tables anyway, since you're reading all the master and details (?).
Maybe just?:
FROM Details d
INNER JOIN Master m ON d.ID LIKE m.ID + '*' + m.Code + '%'
Hopefully (presumably?) both tables are clustered on ID, so you might be able to force a MERGE join without a sort being required, if SQL doesn't generate such a plan on its own (then again, maybe not, as SQL must be ultra-cautious about using a MERGE join without a sort, for fear of bad results).
Edit:
The Details table should have had a composite key of ( master_id, details_id ). That would make joining a breeze, but wouldn't give up any flexibility in the actual values in the keys.
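With a composite key like that (master_id and details_id are hypothetical column names for this sketch, not the real schema), the join collapses to a plain equality:

```sql
-- All Detail rows for each Master, via the composite key's leading column
SELECT d.*
FROM   Details d
INNER JOIN Master m ON m.ID = d.master_id;
```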
SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.
November 7, 2017 at 1:13 pm
aaron.reese - Friday, November 3, 2017 6:41 AM
Ummm... I'm pretty sure it actually IS broke. That kind of database with any kind of data volume = recipe for disaster. It wouldn't take much to totally ruin this database.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
November 8, 2017 at 2:28 am
I would agree that it is broke, but it's not as broke as some other parts of the system 🙂
In reality most of the data is historical, so the poor design does not significantly impact day-to-day performance. When they get round to migrating it, I hope they also partition the data.