I'm right there with you, Gail. I've seen so many posts and articles on this and other sites that claim one performance fact or another and a great many of them are flat out wrong. I've also seen people that post execution plans as "proof of performance" based on 10 rows. Neither is good. There has to be a run, it has to be measured, and the correct tools have to be used to do the measurements (for example, never use SET STATISTICS to measure performance of a scalar or multi-statement UDF).
You points about datatypes are also spot on. There have been a number of posts and articles that claim that (for example) a While loop beats certain set-based techniques and, even with supposed "proof" in the form of large dataset testing, the experiment itself was flawed and weighed heavily in favor of the While loop because of datatype mismatches.
I'll also add that one cannot just say "I tested it and here are the results". If we all believed in such claims, we'd all believe that cold fusion was a reality. The test absolutely MUST be repeatable by others and that means the data, the code, and the test harness MUST be published.
To wit, the words "It's a best practice to..." are some of my least favorite words. My take on those words is frequently "Just because a million people are doing it, doesn't make it a best practice because they could all be doing it wrong". A great example of that can be found in the posts that use a Recursive CTE that increments some value. It's frequently slower and more resource intensive than a While loop yet there are hundreds of posts that outright claim or strongly infer that it's a best practice especially if there are "only a small number of rows" (one of my other least favorite expressions).
Shifting gears a bit, one of the greatest advantages of using the scientific method and posting the proof in the form of runnable code is that you can learn a lot from the community. For example, I wrote an article about a method to remove duplicated spaces that blew the doors off several While loop methods. Someone else pointed out the nested REPLACE method that blew the doors off my method in the article and then proved it with the very same million row test table that I'd created for the article.
Yeah... I'm with you... "Cite it or prove it" and if you can't prove what's been cited (we all know that everything on the internet, especially MS articles and articles written by MVPs and other experts, are always 100% correct, right? :sick:), then it's time to get busy and make a real proof. So I'll modify the "Cite it or prove it" and add that some good bit of research should be done before one cites something written by one of the proverbial cool kids.
Yes, there are certain people out there that I have an inherent trust for in what they write. For example, Paul Randal's and Kimberly Tripp's articles are some of the most trustworthy articles out there and it's not because they were former MS employee's or wrote parts of SQL Server(although that does help). It's because most of their articles PROVE what they have said and they've coughed extensive information on how the readers can prove it to themselves. Hear-say does not constitute a fact or best practice no matter how often it's repeated.
is pronounced "ree-bar
" and is a "Modenism
" for R
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".
"If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)