Self Join

  • I'm working on a report. I need the details and a summary from one table, basically. I'm using a self-join query. It doesn't perform really well, but it certainly does what I need. Of course the redundant data is poor database design, but this is just a report. So the report is based on a "select" like the one at the end of the code. Is there a better way to do this sort of thing? Here's the code:

    declare @transactions table (
        TranId int not null primary key identity(1,1)
        ,TranCode varchar(4) not null
        ,TranDescription varchar(30) null
        ,TranAmount money
    )

    insert into @transactions(TranCode, TranDescription, TranAmount)
    values
         ('100', 'REST', 100.00)
        ,('100', 'REST', 100.00)
        ,('100', 'REST', 100.00)
        ,('100', 'REST', 100.00)
        ,('200', 'BAR', 200.00)
        ,('200', 'BAR', 120.00)
        ,('200', 'BAR', 160.00)
        ,('200', 'BAR', 9.00)
        ,('200', 'BAR', 45.00)
        ,('300', 'GIFT', 78.00)
        ,('300', 'GIFT', 82.00)
        ,('300', 'GIFT', 35.00)
        ,('300', 'GIFT', 100.00)
        ,('300', 'GIFT', 120.00)
        ,('300', 'GIFT', 72.00)
        ,('300', 'GIFT', 66.00)

    select t.TranCode, t.TranDescription, t.TranAmount, TranTotals.TranCodeTotal
    from @transactions t
    left join (
        select TranCode, SUM(TranAmount) as TranCodeTotal
        from @transactions
        group by TranCode
    ) as TranTotals on t.TranCode = TranTotals.TranCode
    order by TranCode


  • You could use a SUM Window function as an alternative...

    select
        TranCode
        ,TranDescription
        ,TranAmount
        ,sum(TranAmount) over (partition by TranCode) as TranCodeTotal
    from @transactions
    order by TranCode

  • Thanks for the alternative. Hmm, it sure is cleaner syntax-wise. The interesting thing is that, if I understand the execution plan correctly, it takes twice as long. Rather shocking! I would have expected them to be virtually identical.

    When I look at the execution plan, the self join takes 4% of the total batch, while the "window" version takes 8% of the batch. I'm not really great at reading execution plans, but that seems like twice as long.


  • For this kind of analysis, you either have to look closely at the execution plan or have a realistic amount of test data. If you look at the execution plan, you'll see that the self join does two clustered index scans against your table. While that may be fine with a few rows, it is definitely not the most efficient approach when you get to millions or billions of rows. So, for larger amounts of data, I do believe the windowing approach is more efficient.



    Ole Kristian Velstadbråten Bangås - Virinco - Facebook - Twitter

    Concatenating Row Values in Transact-SQL

  • Makes good sense. I'll have to study the plan. I would think the window and the self join end up with the same plan. Probably a good lesson to be learned here.

    Thanks and Happy Holidays.


  • Don't be fooled by the % of total batch numbers in the execution plan. 8% of a batch that runs in 20 ms is much better than 4% of a batch that runs in 500 ms. You have to weigh these numbers against CPU and logical reads.

    You should compare the execution plan against the numbers you see from SET STATISTICS IO, TIME ON.
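    As a sketch of that measurement approach, both queries from earlier in the thread can be run in one batch with statistics enabled; the numbers to compare appear in the Messages tab, not in the plan's cost percentages. (This assumes the @transactions table variable from the original post is declared and populated in the same batch.)

        -- Compare actual work done, not estimated cost percentages.
        set statistics io, time on;

        -- Self-join version: look at "logical reads" and "CPU time" for this statement.
        select t.TranCode, t.TranDescription, t.TranAmount, TranTotals.TranCodeTotal
        from @transactions t
        left join (
            select TranCode, SUM(TranAmount) as TranCodeTotal
            from @transactions
            group by TranCode
        ) as TranTotals on t.TranCode = TranTotals.TranCode
        order by TranCode;

        -- Window-function version: compare its reads and CPU time against the above.
        select TranCode, TranDescription, TranAmount,
            sum(TranAmount) over (partition by TranCode) as TranCodeTotal
        from @transactions
        order by TranCode;

        set statistics io, time off;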

    Todd Fifield

  • I agree with Todd, the cost percentages by themselves aren't that useful. You'll have to compare I/O, CPU, and duration between the two queries with varying amounts of data to see which is the more efficient for your purposes.

  • Costs in execution plans are, by and large, junk data. Use the I/O and time stats instead.

    Another option would be to Outer Apply the sub-query, but that's likely to get the same execution plan as the Outer Join.
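    A sketch of that OUTER APPLY variant, using the table and columns from the original post (as noted, it will likely produce much the same plan as the LEFT JOIN):

        -- Correlated sub-query applied per row; OUTER APPLY keeps rows
        -- even if the sub-query returned nothing, like a left join.
        select t.TranCode, t.TranDescription, t.TranAmount, TranTotals.TranCodeTotal
        from @transactions t
        outer apply (
            select SUM(x.TranAmount) as TranCodeTotal
            from @transactions x
            where x.TranCode = t.TranCode
        ) as TranTotals
        order by t.TranCode;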

    In the real query, are you running against a table variable or a "real" table? That will have a huge impact on the execution plan and the overall speed, in most cases. (Table variables reduce the "cost", but usually take more work and time to run. One of the reasons "costs" are largely a junk value.)

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Yeah, I've come to that conclusion. I was trying to use that stat as a simple shortcut in this case. Should have known better!

    Thanks all!

