Excellent article, well explained!
I have a couple of performance-related comments and suggestions that may be of interest if consistent latency is paramount for your user base.
Never underestimate the effects of parameter-sniffing when adopting a parameterised stored procedure to feed your dataset(s)! A query plan persisted in cache, generated for one parameter value, may be wildly inappropriate for a different value. If this is true of your data or query, the report that returned in four seconds when you tested it may suddenly experience unacceptably high latency when run with different parameters.
If you're unfamiliar with the concept of parameter-sniffing, I direct you to the numerous articles on parameter-sniffing readily available on the web (Brent's Elephant and Mouse offers a good intro: http://www.brentozar.com/archive/2013/06/the-elephant-and-the-mouse-or-parameter-sniffing-in-sql-server/).
So your report which ran in three seconds with an argument of "Luxembourg" is taking four minutes (and counting) for "USA".
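To make the scenario concrete, here's a minimal sketch of the kind of procedure that is vulnerable, together with the hint-based mitigations discussed below. All table, column and procedure names here are invented for illustration:

```sql
-- Hypothetical report procedure: the cached plan is shaped by whichever
-- @Country value is sniffed at first compilation ("Luxembourg" vs "USA").
CREATE PROCEDURE dbo.GetSalesByCountry
    @Country nvarchar(60)
AS
BEGIN
    SELECT od.ProductId, SUM(od.Quantity) AS TotalQty
    FROM dbo.OrderDetails AS od
    JOIN dbo.Orders AS o ON o.OrderId = od.OrderId
    WHERE o.Country = @Country      -- cardinality estimated from the sniffed value
    GROUP BY od.ProductId;
END;

-- Hint-based mitigations, each with trade-offs:
--   ... OPTION (RECOMPILE);            -- fresh plan per execution; extra compile CPU
--   ... OPTION (OPTIMIZE FOR UNKNOWN); -- plan built from average density statistics
```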
Should you resort to "recompile" or "optimise for" hints to combat parameter-sniffing in this scenario? Perhaps, but before you do, consider removing the parameter from the stored procedure altogether and applying it as a filter over the dataset instead, i.e. grab all the data from the database but present only the rows relating to the selected parameter in the report. It might seem like overkill, but remember that:
(a) the data may still exist in cache. If not,
(b) a parallel plan aggregating results for all parameter values may well execute in roughly the same time as a serial plan for a single value. Can you rewrite your procedure and force a parallel plan? (See articles on "parallel plans with cross apply" for more info.)
(c) if the parameter's domain is dense, the volume of additional aggregated data returned from the database may be insignificant.
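A minimal sketch of the parameter-free variant, assuming the same invented schema as before: the procedure aggregates for every country in a single pass, and the report applies the country filter to the rows it receives. Because the query carries no parameter, there is no value to sniff and the plan is stable across users.

```sql
-- Hypothetical parameter-free version: one plan, one (often parallel) scan,
-- aggregated for every country; the report filters the selected country.
CREATE PROCEDURE dbo.GetSalesAllCountries
AS
BEGIN
    SELECT o.Country, od.ProductId, SUM(od.Quantity) AS TotalQty
    FROM dbo.OrderDetails AS od
    JOIN dbo.Orders AS o ON o.OrderId = od.OrderId
    GROUP BY o.Country, od.ProductId;  -- no WHERE on the former parameter
END;
```

If the country domain is dense (point (c) above), the extra rows returned per country are few, so the payload cost of shipping all groups to the report is often modest.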
Whether this technique is effective - or even possible - in your scenario will depend on many things: the size, shape and density of your data, the types of indexes present, the volatility of your plan and data caches, and the operations being performed in the query (e.g. sum, count, avg, min, dense_rank), to name but a few.
Personally, when I've experienced dramatic parameter-sniffing effects, this method has proved successful at consistently achieving acceptable latency for all parameter arguments, provided my dataset is highly aggregated and can be filtered after aggregation (as with a HAVING clause).
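For comparison, this is the shape of "filter after aggregation" expressed inside the query itself, which is exactly what HAVING does; the same predicate can equally be applied by the report over the aggregated dataset instead (names invented, threshold arbitrary):

```sql
-- Aggregate for every country first, then filter the aggregated rows -
-- the predicate never influences the plan the way a sniffed parameter does.
SELECT o.Country, SUM(od.Quantity) AS TotalQty
FROM dbo.OrderDetails AS od
JOIN dbo.Orders AS o ON o.OrderId = od.OrderId
GROUP BY o.Country
HAVING SUM(od.Quantity) > 1000;  -- applied after GROUP BY, not during the scan
```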
As with all things in engineering, there is no "right way". Evaluate the impact this method has on your scenarios and decide for yourself.
That said, thanks again to Adam for his great article!