RE: Datatype performance - in theory there should be a difference, but in practice is there?

SSC Guru

Points: 1003872

September 25, 2016 at 6:40 pm

#1903005

I can't speak for anyone else but here are my thoughts and a couple or personal observations on the subject.

On the original subject of "does it really matter", the answer is a very profound "YES"!. When I first reported to my current job, there were multiple 10 minute timeouts each day due to "system paralysis", screen performance was greater than 5 seconds (sometimes much greater), and about half of the problems were solved just by paying attention to data types and making them match.

I agree that, by itself, fixing a 100 ms proc or query so that it runs in less than 10 ms doesn't sound like much until you realize that tens of thousands of them exist in query cache and they're all used every day sometimes millions of times each in just 8 hours. Then realize that the key to solving many of the blocking issues is to "just" make things run faster.

Yes, today's hardware is a bloody miracle. "Just" by changing to a new server with some fancy SSDs, faster hard drives, and doubling the number of CPUs, we were able to double the performance of most code. Heh... as my grand kids might say, "Big wow, what a let down". That still pales in comparison to what we were able to do by paying attention to data type matching and making code SARGable, which is frequently dependent on data type matching. In many cases, the 7-800X improvement that Eirikur mentioned, which is very significant, was small change compared to some improvements that we realized, which resulted in thousands of times improvement with a couple of rare birds that experienced nearly a million times improvement.

And no... it wasn't just data type matching that we did. That helped a lot but it's not a panacea. It's only the first necessary step in a much larger picture but, if you leave it out, the much larger picture doesn't have much of a chance to be resolved.

Ignore read reduction if you want. I won't. We've reduced reads in the normal 8 hour business day to the tune of 150 Tera Bytes of IO just by paying attention to mostly small stuff like data type matching and SARGability. Blocking is now virtually non-existent in the system, system "paralysis" outages are unheard of now, and most screen times are sub-second instead of over 5 seconds. The only ones that aren't are for reporting purposes and we've drastically reduced those times as well.

The really bad part of it all was that it only took the previous regime of "expert 'developers'" less than a year to mess all of that up with incorrect and mismatched data types. It has taken us years to fix it because they also screwed the front end down on the same kinds of problems. Make a change, fix several things it affected and some of those fixes affected other things that needed to be changed and over and over and over...

Heh... and then people have the nerve to say it's the fault of the database.

When considering such "small" things like data type matching and "right sizing", you might think you're working on a grain of sand. The problem is that if all the grains of sand have the same problem, then you have a pretty well messed up beach to fix one grain of sand at a time. To borrow and warp a phrase, your really should sweat the small stuff and... it's all small stuff. :w00t: You can't afford to get the small stuff wrong.

Rant complete... regard all further alarms.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)