• Sorry for the late reply, but it's been a very busy week for me. I finally got time to look at these posts, and it's always great to see people thinking about how to improve solutions, as that's where a broader spectrum of knowledge comes from.

    In response to Post #1559754.

    1. I would prefer apples, pineapples or a tin of tropical fruit and I know which one I prefer landing on my head powered by gravity alone 😉

    2. You spotted one of the further optimisations I left out as the article was getting quite large. It wasn't the DISTINCT that was causing the issue, though; it was the ranking function running over duplicate rows, and by moving it to the outer query, as you showed, you gain a little more performance. Well spotted.

    The lookup table query changes to the following to get this performance increase.

    insert into #lookup (rate_code, combo)
    select q.rate_code, power(2, row_number() over (order by q.rate_code) - 1)
    from (select distinct rate_code from #worker_rate) q;
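For anyone following along, the intent of that query is to assign each distinct rate code a power of two (1, 2, 4, ...) so that the sum of a worker's codes forms a unique bitmask for that exact combination. Outside SQL, the same arithmetic can be sketched in a few lines of Python (the rate codes here are invented purely for illustration):

```python
# Mirror power(2, row_number() over (order by rate_code) - 1):
# each distinct rate code, in sorted order, gets the next power of two.
rate_codes = sorted({"A", "B", "C"})                    # distinct rate codes
combo = {code: 2 ** i for i, code in enumerate(rate_codes)}
# combo is now {"A": 1, "B": 2, "C": 4}

def group_id(codes):
    """Sum of combo values: a unique id per *set* of rate codes."""
    return sum(combo[c] for c in set(codes))
```

Because every subset of codes sums to a different value, two workers get the same id only if they hold exactly the same rate codes.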

    3. While it's useful to be able to provide the results in a single query, you lose that important index, so as the number of rate codes increases the performance drops, as you pointed out in your second post. I just wish MS would add a feature to attach indexes to temporary worktables after the fact, doing away with the need to split the query.

    So starting with 1 million rows, your single-query approach on my server (a different server this time, so we'll compare both on the same hardware) with the same three rate codes came out at

    1,842ms CPU and 1,177ms elapsed time

    The article example in its original form took

    2,356ms CPU and 1,175ms elapsed time (summing the two stages)

    The article example with the enhanced lookup table code took

    2,013ms CPU and 1,195ms elapsed time (summing the two stages)

    Increasing the number of rate codes (which reduces the number of worker ids, as we are still only taking the top 1 million rows), your single query takes

    966ms CPU and 320ms elapsed time

    The enhanced article dual stage query takes

    889ms CPU and 397ms elapsed time

    So great spot with row_number instead of the dense_rank function, but in my opinion it's not quite enough of a performance boost to warrant switching to the single query. The result output also became non-contiguous, which causes a problem if that matters to you. You are returning the count of the rates for each worker instead of a unique group id showing that every worker in that group has the same rate codes.

    grp, which was 1, 2, 3, changes to 3, 13, 14, and two workers with different rate codes but the same number of them will end up with the same group code.
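That collision is easy to demonstrate: a plain count coincides for different sets of codes, while the power-of-two sums from the lookup table do not. A small Python sketch (the workers and codes are made up for illustration):

```python
# Power-of-two codes, as assigned by the lookup table.
combo = {"A": 1, "B": 2, "C": 4}

# Two workers with different rate codes but the same number of them.
worker_1 = {"A", "B"}   # bitmask sum 1 + 2 = 3
worker_2 = {"A", "C"}   # bitmask sum 1 + 4 = 5

# A plain count cannot tell the workers apart...
assert len(worker_1) == len(worker_2)
# ...but the bitmask sums distinguish the two combinations.
assert sum(combo[c] for c in worker_1) != sum(combo[c] for c in worker_2)
```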

    It's always very important to check the output of any optimisation, and luckily this utility does that for me, so it's quick to catch, as you can see in the attached image (if it gets attached).