Parsing Parameters in a Stored Procedure
Posted Saturday, November 3, 2012 10:57 AM
Forum Newbie


Group: General Forum Members
Last Login: Monday, March 31, 2014 3:37 PM
Points: 3, Visits: 55
Jeff, this article is about calling a stored procedure with multiple parameters, not about retrieving data from a column with comma-separated values. Do you really believe it matters whether one call with a few comma-separated values takes 30 milliseconds or 100% more (60 milliseconds)? I agree that there are ways to optimize the parsing methods, but be realistic: in a year of use you will save 5 seconds of processor time, and you will spend 1 hour developing it.
Post #1380725
Posted Saturday, November 3, 2012 1:08 PM


Right there with Babe


Group: General Forum Members
Last Login: Tuesday, July 22, 2014 10:23 PM
Points: 795, Visits: 1,548
Jeff Moden (8/11/2010)

If the data is worth having, it's worth validating.


I'm gonna print that out and stick it on the wall of my office and make people read it to me every time they ask me to write a report that will magically filter out or "correct" all the junk data entered into [insert shoddy application here].




One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important.
Bertrand Russell
Post #1380739
Posted Saturday, November 3, 2012 5:35 PM
Old Hand


Group: General Forum Members
Last Login: Thursday, June 5, 2014 8:23 PM
Points: 363, Visits: 358
Peter Di (11/3/2012)
Jeff, this article is about calling a stored procedure with multiple parameters, not about retrieving data from a column with comma-separated values. Do you really believe it matters whether one call with a few comma-separated values takes 30 milliseconds or 100% more (60 milliseconds)? I agree that there are ways to optimize the parsing methods, but be realistic: in a year of use you will save 5 seconds of processor time, and you will spend 1 hour developing it.

I heard precisely that line from a colleague of mine, about a procedure doing just such a simple parameter strip.
Sounds like a compelling argument? Not when the procedure was being run a couple million times per day.
And yes, the greater performance issue in this case was poor application design: the procedure was being run more frequently than necessary. But the point is that even micro-performance counts, depending on the context.



Post #1380753
Posted Saturday, November 3, 2012 8:31 PM


SSC-Dedicated


Group: General Forum Members
Last Login: Today @ 8:22 AM
Points: 36,767, Visits: 31,223
Peter Di (11/3/2012)
Jeff, this article is about calling a stored procedure with multiple parameters, not about retrieving data from a column with comma-separated values. Do you really believe it matters whether one call with a few comma-separated values takes 30 milliseconds or 100% more (60 milliseconds)? I agree that there are ways to optimize the parsing methods, but be realistic: in a year of use you will save 5 seconds of processor time, and you will spend 1 hour developing it.


Very good questions, Peter. Seriously. And, apologies for the long winded answer but good questions deserve thoughtful answers.

First, my objections weren't just about time/duration. There was nothing in the code to identify when one or more elements of the passed parameter weren't in the comparison table; the design was missing the necessary validation and feedback. That, of course, could easily be fixed, but that would also eat into the hour you're talking about. If the omission wasn't found, a missed error could cost much more than time, depending on what the code was actually doing.
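To illustrate the kind of validation and feedback being described, here is a sketch only: the splitter function, table, and column names are all hypothetical, not taken from the article.

```sql
-- Hypothetical example: reject the whole call if any element of the
-- comma-separated parameter has no match in the comparison table.
DECLARE @BadItems VARCHAR(8000);

SELECT @BadItems = COALESCE(@BadItems + ',', '') + split.Item
  FROM dbo.SplitCSV(@pItemList) AS split          -- any splitter function
  LEFT JOIN dbo.ComparisonTable AS ct
         ON ct.ItemCode = split.Item
 WHERE ct.ItemCode IS NULL;                       -- element not found

IF @BadItems IS NOT NULL
BEGIN
    -- Feedback to the caller instead of silently dropping elements.
    RAISERROR('Unknown item(s) in parameter: %s', 16, 1, @BadItems);
    RETURN;
END;
```

The point is only that the unmatched elements are detected and reported; how the error is surfaced (RAISERROR, an output parameter, a result set) is a design choice.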

Shifting gears back to the time-related questions.

I agree. One call with a few comma-separated values against a relatively tiny table that takes 60 or even 600 milliseconds isn't going to matter in the grand scheme of an application... until it's no longer one call. If we take the lower number of just 30 milliseconds and put it up against a higher-usage application where it might be used a thousand times per second, we're suddenly talking about 30 seconds of CPU time per second of wall-clock time, or 30 CPU-hours for every hour, for a simple task that should take nearly zero. If the method is used in many places in the app for each call, you're suddenly talking about a whole lot of CPU time being used for something so simple. Yeah, I know. "Hardware is cheap". We'll talk about that at the end of this.

Ok... agreed. Not everyone is going to have that frequency of usage (1,000 times per second) for an app, and it might, in fact, only be used once in the app. As you correctly point out (I'm right there with you on this), is it really worth spending an hour to make it take only a couple of milliseconds (hopefully less) instead of 30? The answer is patently "No", but not quite in the way you might be thinking. Developers shouldn't have to spend more than a minute or two on something like this, because they should already know that nearly every app is going to need to handle such things and, unless they're rank newbies, they should have already studied which methods are best (a combination of 100% accuracy and excellent performance) and have one ready to copy and paste from their library. I guess that's a personal gripe of mine: a lot of people don't study for the job they're supposed to be doing.

Now, let's get to the more insidious side of this. Low usage, low row counts, and supposedly low frequency should never be used as an excuse for "good enough" code, for several reasons. First, as I pointed out, a Developer should already know what the fast stuff really is and have it at the ready for CPR (Copy, Paste, Replace). Second, there are developers who haven't taken quite that interest in their job, so when they're hit with a tight schedule, they'll use anything they can find so long as it works on one to ten rows (or calls), never giving consideration to the fact that their work will have to scale.
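For readers wondering what "the fast stuff" looks like, here is a minimal sketch of the set-based, tally-style splitter approach: a numbers table finds every delimiter position in one pass instead of looping. This shows the general idea behind published splitters such as Jeff's DelimitedSplit8K; it is not his actual code, and the function name is made up.

```sql
-- Set-based splitter sketch: a tally (numbers) CTE locates every comma
-- in one pass; no WHILE loop, no cursor, no RBAR.
CREATE FUNCTION dbo.SplitCSV (@List VARCHAR(8000))
RETURNS TABLE
AS
RETURN
WITH E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),
                                     (1),(1),(1),(1),(1)) AS t(N)),
     -- 10^4 = 10,000 rows, enough to cover VARCHAR(8000)
     Tally(N) AS (SELECT TOP (DATALENGTH(ISNULL(@List, 1)))
                         ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
                    FROM E1 a CROSS JOIN E1 b CROSS JOIN E1 c CROSS JOIN E1 d),
     -- Each element starts at position 1 or just after a comma.
     Starts(N) AS (SELECT 1
                   UNION ALL
                   SELECT N + 1 FROM Tally WHERE SUBSTRING(@List, N, 1) = ',')
SELECT Item = SUBSTRING(@List, N,
                        ISNULL(NULLIF(CHARINDEX(',', @List, N), 0) - N, 8000))
  FROM Starts;
```

A usage example would simply be `SELECT Item FROM dbo.SplitCSV('1,2,3');` with one row per element. The win over a loop is that the whole split compiles to a single set-based query plan.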

I just went through this at work, again, with code very similar to that in the article but backwards in flow of data. Someone wrote some code against a table that everyone "knew" wasn't going to grow by much. The table was supposed to have 28 rows in it and it wasn't likely that it would ever even double. They were right. After 2 years, it should have had only 51 rows in it. The trouble is that they didn't want to maintain the table manually, so as the number of possible selections started to grow, they used the very same method they had previously used (to save development time) but against a data table. When I discovered the problem, the data table had grown to 4.7 million rows, and they had "thoughtfully" added a DISTINCT to create the lookup list, with a UNION against the original table. The code was only used 1,000 times per hour (low frequency, right?) but it had grown (to >2 seconds per call) to using a total of 40 CPU minutes and more than an eighth of a terabyte in reads per hour... to return just 51 rows for a drill-down menu.
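To make the shape of that problem concrete, here is a hedged reconstruction with entirely hypothetical table names; the actual schema, and the fix Jeff deployed, are not shown in the thread.

```sql
-- The anti-pattern, reconstructed: rebuilding a ~51-row menu by
-- scanning a 4.7-million-row data table on every single call.
SELECT DISTINCT td.ItemName
  FROM dbo.BigDataTable AS td       -- millions of rows scanned per call
UNION
SELECT st.ItemName
  FROM dbo.SmallLookupTable AS st;  -- the original 28-row table

-- One possible self-maintaining alternative (not necessarily the fix
-- described above): fold new values into the small lookup table
-- periodically, so the per-call menu query touches only ~51 rows.
INSERT INTO dbo.SmallLookupTable (ItemName)
SELECT DISTINCT td.ItemName
  FROM dbo.BigDataTable AS td
 WHERE NOT EXISTS (SELECT 1
                     FROM dbo.SmallLookupTable AS st
                    WHERE st.ItemName = td.ItemName);
```

The expensive scan then runs on a schedule (or from a trigger on the big table) instead of a thousand times per hour from the menu.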

Like I said, it had grown to taking slightly more than 2 seconds to return. Let's see... it gets used by our internal people 1,000 times per hour, times 12 hours per day (extended-coverage phone center), every 21.7 working days of the month. That's over 144 FTE hours per month that we're paying people to wait for a pulldown menu to load. Considering the nearly linear growth of the data table over the two years the code had existed, that's about 1,800 hours of wait time we paid people for. Even if a Developer wasn't prepared to handle this eventuality, that's a hell of a tradeoff compared to the 1 hour of development time you were talking about.

It only took me a half hour to discover this problem. It took me about an hour to write the code to fix it. The reason it took that long is that I didn't want anyone to have to fix any front-end code to support the fix, so it had to be 100% transparent to the GUI and it had to continue to be self-maintaining. Just adding the rows to the original table wasn't going to hack that little requirement. Then it had to go through QA testing, where they had to verify not only that the screen was still working correctly but that the underlying data used to build the menu was being interpreted correctly. That took one person an hour, because they're not database people, and in order for them to test what I had done, I had to spend another half hour writing up what I had done and how it worked. That's another 3 hours spent on a problem that could have been avoided by spending an hour to do it right the first time.

For the record, I got each call down to where it belongs at about 200 microseconds and the total number of reads down to just 16 megabytes per hour. There was no rocket science involved in the fix, either. Any developer with just a couple of years of experience could have pulled the same thing off in about an hour of original development time.

So, with apologies to my good friend Kevin Boles, and to answer your question as to whether or not I believe people should worry about a small comma-delimited list taking "just" 30 milliseconds when used on a relatively infrequent basis against a relatively small number of rows, my answer is a resounding "YES THEY SHOULD!", even if it takes them an hour, because you don't know how the method will eventually be used by someone. A whole lot of managers really need to learn this particular lesson when they write a development schedule. Compare the one hour invested to the 1,800 hours of wasted employee time and the several hours spent coming up with and testing a fix.

Why did I apologize to Kevin for this? Because he's an expert performance-tuning consultant who gets paid big bucks to fix things like this. There goes more time and money towards something that could have taken only an hour to do right the first time, or even the second time... or even just a couple of minutes by someone who expects, and is prepared to handle, these types of development requirements as a part of their everyday job.

Will the particular method in the article ever be exposed to so much growth? Maybe not, but you just don't know for sure, because requirements change. Plan for the worst so that if it does happen, you don't have to make repairs just because the scale increased. Bulletproof code just doesn't take that much longer to write, and it's worth every penny down the road.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1380758
Posted Sunday, November 4, 2012 1:01 AM
Old Hand


Group: General Forum Members
Last Login: Thursday, June 5, 2014 8:23 PM
Points: 363, Visits: 358
Damn! I want that on a sign above my desk. Guess I'll have to settle for a bookmark.


Post #1380763
Posted Sunday, November 4, 2012 11:36 AM


SSC-Dedicated


Group: General Forum Members
Last Login: Today @ 8:22 AM
Points: 36,767, Visits: 31,223
GPO (11/3/2012)
Jeff Moden (8/11/2010)

If the data is worth having, it's worth validating.


I'm gonna print that out and stick it on the wall of my office and make people read it to me every time they ask me to write a report that will magically filter out or "correct" all the junk data entered into [insert shoddy application here].




Post #1380809
Posted Sunday, November 4, 2012 2:53 PM
Grasshopper


Group: General Forum Members
Last Login: Thursday, July 24, 2014 6:03 PM
Points: 22, Visits: 120
Speaking of garbage in, garbage out: in one case where I had to split a CSV into a list of items, I spent at least an hour trying to understand why some substrings were still not being split, until I realised that end-users were able to choose ANY character as the separator.
So, here we go again: no proper DESIGN on any layer, starting from the UI. And by the way, the source data store was not a LEGACY system; it was
one of the brand-new vendor's packaged solutions.
Post #1380835
Posted Monday, November 5, 2012 1:21 AM
SSC Veteran


Group: General Forum Members
Last Login: Friday, July 18, 2014 1:05 AM
Points: 214, Visits: 262
Jeff Moden (11/2/2012)
peter.row (11/2/2012)
Hey Diego, nice article. I understand what you meant about the implied performance concerns, and I found Jeff to be very abrasive, virtually to the point of being offensive.


It would be interesting to know why you actually thought that. Care to explain further? Seriously... I don't like to offend folks but I do also like to get points across about not using RBAR to do such things.


I think I was having a pre-tea, early-morning moment and apologise (ironically) for being overly harsh.

I guess it was mainly that people were being chastised for not doing things, based on assumptions about what their use case might be.
Post #1380906
Posted Monday, November 5, 2012 5:23 AM


SSC-Dedicated


Group: General Forum Members
Last Login: Today @ 8:22 AM
Points: 36,767, Visits: 31,223
peter.row (11/5/2012)
Jeff Moden (11/2/2012)
peter.row (11/2/2012)
Hey Diego, nice article. I understand what you meant about the implied performance concerns, and I found Jeff to be very abrasive, virtually to the point of being offensive.


It would be interesting to know why you actually thought that. Care to explain further? Seriously... I don't like to offend folks but I do also like to get points across about not using RBAR to do such things.


I think I was having a pre-tea, early-morning moment and apologise (ironically) for being overly harsh.

I guess it was mainly that people were being chastised for not doing things, based on assumptions about what their use case might be.

Thanks for the feedback, Peter. I sometimes have the same problem with coffee.

Just to clear the air a bit, it's not the immediate use case that I always worry about; it's how other people might use the code for something else. It's not always clear to others how badly something can turn out, because they've just not seen such a thing happen yet, and, as you saw in my recent previous post, it can take a whole lot of typing to avoid sounding harsh.


Post #1381013
Posted Monday, November 5, 2012 8:15 AM
Forum Newbie


Group: General Forum Members
Last Login: Monday, March 31, 2014 3:37 PM
Points: 3, Visits: 55
Jeff, I agree with you. Having thousands or millions of calls to a procedure like this would be a problem. However, stored procedures with multiple parameters are not widely used, and they usually have a much slower part after the parsing. Also, as you can see in the article, there are other ways to send values instead of using comma-separated strings, so optimizing the parsing is not the thing I would start with.

Anyway, you are right that there is a faster way to parse the values, and it would be stupid to ignore it (especially if you have already tested the functions and prepared them for easy implementation). If you don't mind, I will update my article to include one of the methods from your test and a link to the full article.
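For reference, one of the "other ways to send values" mentioned above is a table-valued parameter, which sidesteps string parsing entirely. This is a sketch with hypothetical object names, not code from the article.

```sql
-- Table-valued parameter: the caller sends a real table, so the
-- procedure never has to split a string at all.
CREATE TYPE dbo.IntList AS TABLE (Value INT NOT NULL PRIMARY KEY);
GO
CREATE PROCEDURE dbo.GetOrders
    @pIDs dbo.IntList READONLY       -- TVPs must be declared READONLY
AS
SELECT o.*
  FROM dbo.Orders AS o
  JOIN @pIDs AS ids
    ON ids.Value = o.CustomerID;
GO
-- Caller side: no CSV string, no parsing, and the optimizer sees a table.
DECLARE @ids dbo.IntList;
INSERT INTO @ids (Value) VALUES (1), (2), (3);
EXEC dbo.GetOrders @pIDs = @ids;
```

The trade-off is that the caller (application code or T-SQL) has to build the table, but validation and joins then work like any other table access.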
Post #1381101