• I've worked with many different reporting tools, including SSRS, and they all try to hide the SQL away and encourage you to do grouping and calculations within the report. The idea is good: for users who understand Excel-like formulas and want to focus on design and layout, it makes sense. Some tools (e.g. MicroStrategy) even generate SQL for you, sending the grouping and calcs back to the database, which is probably the only way you would get any semblance of performance by working entirely report-side.

    This is a fine way to operate on small datasets, but it absolutely doesn't scale. The main issues with connecting your report to a huge dataset are

    1) the volume of data that is transferred over the wire from the database to the reporting tool

    2) the difficulty with rendering huge numbers of rows in a browser

    This applies to every reporting tool: Tableau, Qlik, SSRS, Power BI, whatever.

    As Jeff pointed out, optimizing the query on the database side means, firstly, you are taking advantage of the faster database query engine and, secondly, you are sending fewer rows over the wire. That equals a faster report render and solves issue (1) above. Issue (2) can stay hidden if you always start designing your report with groups and filters before running it for the first time. I find people don't do that. They connect to the million rows (even 5,000 is too many), try to run the details section (i.e. no groups), and wonder why it falls over and the browser hangs. It's why Tableau will throw a LIMIT or a TOP into the query while in design mode, and why SSRS has pagination on by default.
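    As a rough sketch of the difference (the sales table and its columns are invented for illustration), compare dragging the detail rows across to the report versus letting the database group them:

        -- Report-side grouping: every detail row crosses the wire,
        -- and the reporting tool does the summing.
        SELECT order_id, order_date, region, amount
        FROM sales;

        -- Database-side grouping: the query engine does the work
        -- and only a few dozen rows come back.
        SELECT region, SUM(amount) AS total_amount
        FROM sales
        GROUP BY region;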

    There is, of course, the concept of drilling: drill down, drill across, slicing and dicing. Many specs for dashboards call for flexible parameters and the ability for users to drill down to more granular data without waiting for the report to reload. If you've done all your grouping database-side, then you sacrifice the flexibility of changing that grouping on the fly without another round trip to the database, potentially waiting for the report to reload or respond.

    If you think of a ratio calculation, or a percent-of-total calculation, in report syntax as being equivalent to a variation of a SUM() OVER window function in SQL, then what the report calc gives you is actually more flexible. The PARTITION BY part is not hard coded; it's relative to the current context of the report. You can change a drop-down or other control on the report and effectively change the grouping, and the % of total calc will, in SQL terms, adjust itself to be OVER the appropriate partition.
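    In SQL terms, what the report recalculates on the fly looks something like this (table and column names invented for illustration), except here the partition is fixed when the query is written:

        -- Percent of total within each region, with the partition baked in.
        -- A report-side % of total recomputes against whatever grouping
        -- the user currently has on screen instead.
        SELECT region,
               product,
               SUM(amount) AS product_amount,
               100.0 * SUM(amount)
                     / SUM(SUM(amount)) OVER (PARTITION BY region) AS pct_of_region
        FROM sales
        GROUP BY region, product;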

    The downside is that having an "AJAX-like" (i.e. no page reload) response while keeping full parameter flexibility implies returning a huge dataset to the report. I think that should be discouraged as much as possible, as you are back to problem (1).

    There are things reports can do that an optimized query returning a single aggregated dataset doesn't cater for; locking everything into one query restricts flexibility. On the other hand, total flexibility leads to extremely slow performance. A report needs to be a compromise between those extremes. User experience is king.

    There are a few good strategies.

    Take the user on a journey through the data: start out fast, aggregated and high level, and then let the user go to the slow places if they need to. They will understand why it's slower.

    Take, for example, measuring CPU usage on a server. You don't really need a second-by-second breakdown as an initial report view. It's too much data (86,400 data points in a day) and it's too noisy. You might want to see the max per hour in a day; that's down to 24 data points. From there, if you spot an hour with high usage, zoom into (drill down to) that hour and see if there is an issue. It's a simple example, and flawed, but it makes my point.
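    A sketch of the two levels, assuming a made-up cpu_metrics table with one row per second (T-SQL flavoured syntax):

        -- Initial view: 24 rows, the max CPU reading per hour for one day.
        SELECT DATEPART(hour, sample_time) AS sample_hour,
               MAX(cpu_pct)                AS max_cpu_pct
        FROM cpu_metrics
        WHERE sample_time >= '2024-03-01'
          AND sample_time <  '2024-03-02'
        GROUP BY DATEPART(hour, sample_time)
        ORDER BY sample_hour;

        -- Drill down: only when the user picks a suspicious hour
        -- do we go back for the 3,600 second-level rows.
        SELECT sample_time, cpu_pct
        FROM cpu_metrics
        WHERE sample_time >= '2024-03-01 14:00'
          AND sample_time <  '2024-03-01 15:00'
        ORDER BY sample_time;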

    Detect the user, and pre-filter the granular data to what is relevant to them if possible. E.g. if they are the sales manager for a single region, filter to that region. Reporting Services has a system variable you can interrogate to determine who is currently running the report. Assuming Windows authentication, you can do some magic right there. Personalized, custom reports are pretty impressive.
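    In SSRS, for instance, you can map the built-in User!UserID field to a query parameter and do the filtering in the dataset query. A sketch, with a made-up region_managers mapping table:

        -- @UserID is fed from the SSRS built-in User!UserID
        -- (the Windows login of whoever is running the report).
        SELECT s.region, s.product, SUM(s.amount) AS total_amount
        FROM sales s
        JOIN region_managers rm
          ON rm.region = s.region
        WHERE rm.windows_login = @UserID   -- only their region(s) come back
        GROUP BY s.region, s.product;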

    One of my favourites is the exception report. Query ahead of time for the top 10, the bottom 10, the statistically significant, the anomalies, the failures, and present those in the report. After all, that's what people are often doing with the report: looking for weirdness. Make their jobs easier. If they are an analyst, or they just want to dump the data, then why even create the report? Of course there are legit reasons for that: financial reports, legally defensible documents, etc. But for the most part, if users just want to dump out the data and import it into Excel or a different reporting tool, then what they need is a stored proc or a web service, not a report.
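    A rough sketch of an exception query, again with invented names: rank everything once, then return only the rows worth a second look:

        WITH ranked AS (
            SELECT region,
                   product,
                   SUM(amount)                             AS total_amount,
                   RANK() OVER (ORDER BY SUM(amount) DESC) AS rank_from_top,
                   RANK() OVER (ORDER BY SUM(amount) ASC)  AS rank_from_bottom
            FROM sales
            GROUP BY region, product
        )
        SELECT region, product, total_amount,
               CASE WHEN rank_from_top <= 10 THEN 'top 10' ELSE 'bottom 10' END AS why_included
        FROM ranked
        WHERE rank_from_top <= 10 OR rank_from_bottom <= 10;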

    Finally, if you're a report writer, take some time with SQL, and don't be put off by syntax that is new to you or looks complicated. Play around with it. CTEs are just an upfront way of writing subqueries with a few advantages (apart from recursive ones, which don't have many advantages!). A "cascading CTE" just means a bunch of nested subqueries, but much more clearly laid out. Window functions are just ways of getting at data that's not part of the current row being returned; you're already doing the same thing in your report calculations. OVER is amazing, really.
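    For instance, a "cascading" pair of CTEs plus one window function reads top to bottom instead of inside out (names invented for illustration; T-SQL flavoured date functions):

        -- First CTE: aggregate to one row per region per month.
        WITH monthly AS (
            SELECT region,
                   YEAR(order_date)  AS order_year,
                   MONTH(order_date) AS order_month,
                   SUM(amount)       AS monthly_amount
            FROM sales
            GROUP BY region, YEAR(order_date), MONTH(order_date)
        ),
        -- Second CTE cascades off the first: compare each month
        -- to the previous one without writing a self-join.
        with_prev AS (
            SELECT monthly.*,
                   LAG(monthly_amount) OVER (PARTITION BY region
                                             ORDER BY order_year, order_month) AS prev_amount
            FROM monthly
        )
        SELECT region, order_year, order_month,
               monthly_amount,
               monthly_amount - prev_amount AS change_vs_prev_month
        FROM with_prev;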