Many people, when examining the execution plan for a malfunctioning query, tend to focus largely on the usual set of "suspect" operators, such as Lookups, Spools, Sorts, and Index Scans (rather than Seeks), and then use a mixture of indexing and query rewriting to try to make the problem go away. This approach can have some success, but it tends to fare less well against the really knotty problems, such as the erratically performing query: the one that performs admirably within SSMS but terribly when executed from the application, or the one that, having caused no previous trouble, suddenly starts to misbehave.
What I came away with, from attending several execution plan-related sessions at the recent PASS Summit, from the likes of Grant Fritchey, Gail Shaw, Adam Machanic and Klaus Aschenbrenner, was a renewed appreciation of, firstly, how often erratic or unpredictable query performance is related to issues surrounding data skew, statistics and parameter sniffing, and secondly, how much information regarding these issues is available directly from the execution plan, but tucked away in the Properties sheets, and behind seemingly "trivial" operators, such as SELECT, to which I'd previously paid little attention.
Did you know, for example, that the properties of the SELECT operator of an execution plan report a "Reason for early termination of statement optimization"? If you see the value "Time out" here, it's a cause for concern, as it means the optimizer failed to find a "good enough" plan in the time allotted.
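You don't even have to open each plan by hand to spot this; the same information is stored in the cached plan XML, as the StatementOptmEarlyAbortReason attribute. Here's a rough sketch (assuming VIEW SERVER STATE permission) that trawls the plan cache for plans whose optimization timed out:

```sql
-- Find cached plans where the optimizer gave up early with "TimeOut".
WITH XMLNAMESPACES
    (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (20)
    st.text       AS query_text,
    qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
WHERE qp.query_plan.exist
      ('//StmtSimple[@StatementOptmEarlyAbortReason="TimeOut"]') = 1;
```

Bear in mind this XQuery over the whole plan cache is itself expensive, so run it on a quiet system, or restrict it further.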
Also hiding in the properties of the SELECT operator are the exact values of the SET options used for the query execution. Differing values for the various SET options between execution environments are a classic cause of a procedure that has a good plan in SSMS, but a bad plan when executed from a client application. If the same procedure is executed with different SET options, then any existing plan for that procedure will not be reused. Instead, a new plan will be generated, and if the parameter values sniffed this time happen to be highly non-representative of "normal" execution, then you'll get a bad plan and poor performance. Just such an issue is reported in this issue of Database Weekly. In that case, the author tracked the problem down using DMVs and plan handles but, in fact, you can compare the SET options directly, by generating the actual execution plan in each case and reading the values from the SELECT operator.
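If you do want to go the DMV route, the relevant attribute is exposed by sys.dm_exec_plan_attributes. A minimal sketch, assuming VIEW SERVER STATE permission and a hypothetical procedure name (dbo.usp_GetOrders is just a placeholder): it lists every cached plan for the procedure, with the set_options bitmap that distinguishes one compilation from another.

```sql
-- Two rows here with different set_options values means the same procedure
-- has been compiled separately for two different SET environments.
SELECT
    pa.value      AS set_options_bitmap,
    cp.usecounts,
    cp.plan_handle
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)        AS st
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) AS pa
WHERE st.objectid = OBJECT_ID('dbo.usp_GetOrders')
  AND pa.attribute = 'set_options';
```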
Here's another little piece of information behind the SELECT operator of which I was previously unaware: Memory Grant. Let's say we have a hash or sort operation that the optimizer predicts will work on 5 rows, based on available statistics. A correspondingly small amount of memory will be granted to perform that operation, and you'll see this memory grant in the SELECT operator of the execution plan. If it turns out that the stats are inaccurate, and the operation actually receives 500 rows, it can't suddenly request more memory, and so the operation will simply spill to disk, in tempdb. It's worth capturing actual execution plans (as opposed to estimated, which is what's stored in the plan cache), and keeping an eye out for wild disparities between the estimated and actual number of rows returned by a given operation. It's likely that outdated stats are the cause; they are only auto-updated when "20% + 500 rows" of the data in a table have changed. For tables with millions of rows, this means that data skew can change drastically before the stats update automatically.
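You can check how stale a table's statistics are before that threshold ever triggers. A sketch, assuming SQL Server 2008 R2 SP2 or later and a hypothetical dbo.Orders table, using sys.dm_db_stats_properties:

```sql
-- For each statistics object on the table, show when it was last updated
-- and how many modifications have accumulated since then.
SELECT
    s.name                  AS stats_name,
    sp.last_updated,
    sp.rows,
    sp.modification_counter -- changes since the last stats update
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.Orders');
```

A large modification_counter relative to rows is a hint that a manual UPDATE STATISTICS may be overdue, long before the automatic threshold is reached.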
It takes a while to master the art of reading and understanding execution plans, to know exactly where to look for the information, and to understand not only which "dodgy" operators to look out for, but also the "hidden" properties of each operator that will help you understand the flow of data through the plan. If it's true that it takes 10,000 hours of training to master SQL Server, then you probably need to invest several tens of those hours mastering execution plans.