Those of you who follow me on Instagram, Twitter, or Facebook know I’m taking most of August off for vacation. Erika and I spent the last couple of weeks in Telluride, had a brief stop in Sacramento for SQL Saturday, and this week we’re touring wineries in Napa and Sonoma.
As part of recharging, I caught up on a bunch of research papers that I’d set aside to read, and I realized I should share my favorites with you. For each paper, I’m going to try to boil down why I found it interesting, and why you might find it interesting too.
Let’s start with a paper that finds a new solution to a really classic problem.
As long as I’ve been working playing with databases, execution plans have had one giant pain point: parameter sniffing. I constantly run into situations where the slightest change of a parameter can completely rework an entire execution plan into something wildly different.
In the Microsoft Research paper Plan Stitch: Harnessing the Best of Many Plans by Bailu Ding, Sudipto Das, Wentao Wu, Surajit Chaudhuri, and Vivek Narasayya (2018), the authors propose something really cool: watching an execution plan change over time, splitting it into parts, and then combining parts to form a SuperPlan™. (They don’t call it a SuperPlan™ – it just seems appropriate, right?)
They even went so far as to actually implement Plan Stitch in SQL Server. I hope you’re sitting down:
We implement Plan Stitch in C# as a component on top of Microsoft SQL Server without requiring any changes to the engine. SQL Server stores previously-executed plans and their execution statistics, including plan structures, properties of operators (e.g., sort columns), and operator-level execution costs [37, 41]. Plan Stitch passively monitors query executions and triggers the search for alternative plans when execution statistics for multiple plans for the same query are available. Our implementation follows the logical architecture in Figure 2, where we force the stitched plan using the plan hinting API supported in SQL Server [36, 46].
Footnote 37 refers to Query Store, and footnote 36 refers to plan guides.
This is completely wonderful and I am mesmerized. When “Automatic Tuning” came out, I was infuriated at the ridiculous use of that term to describe what that feature actually did (simple plan regression.) This, THIS, ladies and gentlemen, THIS is automatic tuning. This is what that phrase should be applied to. So the only question becomes – when this feature eventually ships, what is marketing going to call it? Hekatuning?
If you liked that white paper, you’re going to love these:
- Leveraging Re-costing for Online Optimization of Parameterized Queries with Guarantees – saving multiple query plans for a parameter-sensitive query.
- Cosette: Automated Reasoning of Database Queries – if you’re going to build something to rewrite user queries (like the Database Tuning Advisor, but for query rewrites rather than index changes), then you need something to validate that a modified query produces the same results. That’s Cosette.
- Froid and the relational database query quandry – a nicely transcribed interview with Dr. Karthik Ramachandra.
- Cuttlefish: A Lightweight Primitive for Adaptive Query Processing – Microsoft Research prototyped this in Apache Spark, but it’s still neat to see because it’s relevant.
- Query and Resource Optimization: Bridging the Gap – if your database is hosted in a Kubernetes cluster with constantly changing resources, then your query plan needs to be aware of that. Today, it’s not.
And for the record, yes, I consider writing blog posts like this to still be vacation, hahaha. I banged out all 5 of this week’s posts while sitting poolside with drinks. (That last word is definitely plural.)
5 Comments. Leave new
Its a hard life Brent, but thankfully you are there to do it for us. 😀
Seriously though, I am utterly poleaxed by that Plan Stitch thing. That is exactly what I wanted for Automatic Tuning too.
The underlying architecture and methodology also feels like an extensible approach that could provide an avenue for additional optimizations to be applied. A low overhead approach to the server itself analyzing the piles of data provided over time and taking contextual actions.
Mindblowingly spiffy!
Hahaha, live to serve.
°?°
Well, that didn’t work. Imagine a surprised pikachu.
[…] Brent Ozar reviews a Microsoft Research paper: […]