Unwanted parameter truncation

  • I'm about ready to blow my brains out. I just spent almost a day chasing a weird error in a stored procedure, only to have it turn out to be because I had specified a varchar parameter length just a bit too short. SQL Server passed the value in without comment, chopped off what didn't fit and went merrily on it's way, generating nonsense as a result. I've shot myself in the foot several times this way, and it's getting kind of old.

    Most of my recent background is development in programming languages that automatically handle string lengths, so I'm not used to worrying about this and I keep forgetting. Does anyone know any good tricks for dealing with this? Automatically make everything 4000? 8000? Max? Put byte counts with each parameter? Special characters at the start and end of every parameter, and routines to test every parameter in every stored procedure? Stab myself with a fork every time I forget?

  • I don't believe there's a way within SQL to prohibit the truncation of the strings when it's assigned to the variable. How is this proc called? If it's from an application then proper form validations should be in place to check for lengths before they are passed into the procedure.

    Automatically setting everything varchar(max) should be okay (maybe more of a poor programming practice than anything) but char(max) is not always a good option...so you may want to avoid that. There is storage/memory "overhead" using char, 1 byte per character, so a char(max) = 8K - i.e. a lot of potentially wasted memory being allocated.

    My advice would be to test for your parameters before assigning them to the parameters being passed into your procedure

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (12/19/2012)


    I don't believe there's a way within SQL to prohibit the truncation of the strings when it's assigned to the variable. How is this proc called? If it's from an application then proper form validations should be in place to check for lengths before they are passed into the procedure.

    Automatically setting everything varchar(max) should be okay (maybe more of a poor programming practice than anything) but char(max) is not always a good option...so you may want to avoid that. There is storage/memory "overhead" using char, 1 byte per character, so a char(max) = 8K - i.e. a lot of potentially wasted memory being allocated.

    My advice would be to test for your parameters before assigning them to the parameters being passed into your procedure

    Just to be clear, there isn't any such thing as CHAR(MAX) nor NCHAR(MAX) nor BINARY(MAX). Only the variable length type of character based datatypes have the MAX capability. The max for char is 8k as indicated above but it must be defined as CHAR(8k). That's also typically a bad idea for the reason given above.

    There's also a disadvantage to using (for example) VARCHAR(MAX). The MAX datatypes really don't like to be joined to and it will slow down queries. "Right Sizing" is always the correct answer but, if you're unsure of what the "right size" actually is, create the variable to be 1 larger than what you think the "right size" is. If the DATALENGTH ends up being greater than your "right size", throw an error. While this may seem to be a chore, its part of the parameter validation that I would normally do on varchars that I'm not sure won't grow. Examples of that are address columns, name columns, etc.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • @myself - ID10T error, of course! 8000 is the limit

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (12/19/2012)


    @myself - ID10T error, of course! 8000 is the limit

    Nah... no problem. I've made a similar mistake when I had a NEC (Not Enough Coffee) problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • MyDoggieJessie (12/19/2012)


    I don't believe there's a way within SQL to prohibit the truncation of the strings when it's assigned to the variable. How is this proc called? If it's from an application then proper form validations should be in place to check for lengths before they are passed into the procedure.

    It's called from an Access application.

    Automatically setting everything varchar(max) should be okay (maybe more of a poor programming practice than anything) but char(max) is not always a good option...so you may want to avoid that. There is storage/memory "overhead" using char, 1 byte per character, so a char(max) = 8K - i.e. a lot of potentially wasted memory being allocated.

    char(max) would not be a problem - when I know exactly what I'm dealing with, the length is not an issue. It's the varchar stuff, when I expect maybe twenty characters, so I declare varchar(100) to be safe, then I change the calling logic, or add some padding or delimiters, or some such, and suddenly I'm over 100. Sometimes the procedure crashes, depending on the logic, but sometimes it runs fine, just returns weird results, and I waste a lot of time trying to troubleshoot obscure JOIN and WHERE clauses, when the problem is simply a chopped-off parameter.

    My advice would be to test for your parameters before assigning them to the parameters being passed into your procedure

    Yeah, but what's happening to me is not that the code is passing in stuff that doesn't belong, it's that I misjudge how much room I'm going to need, and then don't get any warning when I run over. If it at least threw an error or gave me some indication that things were not quite right, I'd be able to deal with it easily. If I knew to validate, I could just as easily leave enough room. The problem is that I screw it up, and then don't know that it's screwed up. And I can easily screw up the validation, since there is no solid link between what the application delivers and what SQL Server expects.

    I got my hand slapped a few months back when I posted some code that had varchar(max) all over the place. My reasoning was that it worked somewhat like a high-level language string handling facility, with essentially unlimited lengths. The arguments presented against it at the time (performance hits, among others) convinced me to stop using it, but I'm beginning wonder if the shortcomings might not be worth it.

  • Jeff Moden (12/19/2012)


    There's also a disadvantage to using (for example) VARCHAR(MAX). The MAX datatypes really don't like to be joined to and it will slow down queries.

    So I've been told, but how much? A few milliseconds greater response time against days worth of me tearing my hair out is not a difficult choice.

    "Right Sizing" is always the correct answer but, if you're unsure of what the "right size" actually is, create the variable to be 1 larger than what you think the "right size" is.

    No argument on "Right Sizing", but that's the problem - I don't always know what is going to be fed to some of these procedures, so I estimate, add what I think is a reasonable fudge factor and then discover I didn't leave enough. One example is a delimited string. I have to parse the string and search for stuff. If the string to be parsed is longer than I expected, one of the parameters can get cut in two, and I go looking for part of a parameter, which may or may not give sensible results.

    If the DATALENGTH ends up being greater than your "right size", throw an error. While this may seem to be a chore, its part of the parameter validation that I would normally do on varchars that I'm not sure won't grow. Examples of that are address columns, name columns, etc.

    How do I find out that it's too long? That's my problem - SQL Server simply chops it off, instead of telling me that I have too much. And manually putting in logic to test the length in the application leaves me with the same problem - I have no machine-reliable way of telling how long a parameter SQL Server will allow, so I'm back to manually examining the T-SQL code, and coding the application to match. If one changes, there's no way for the other to know. I can easily change one and forget to do the other, which is where I am now.

  • The code calling the procedure should test string lengths to see if they exceed what the procedure expects. You code the procedure with a specified size, test for it in the application before making the call.

  • pdanes (12/19/2012)


    Jeff Moden (12/19/2012)


    There's also a disadvantage to using (for example) VARCHAR(MAX). The MAX datatypes really don't like to be joined to and it will slow down queries.

    So I've been told, but how much? A few milliseconds greater response time against days worth of me tearing my hair out is not a difficult choice.

    "Right Sizing" is always the correct answer but, if you're unsure of what the "right size" actually is, create the variable to be 1 larger than what you think the "right size" is.

    No argument on "Right Sizing", but that's the problem - I don't always know what is going to be fed to some of these procedures, so I estimate, add what I think is a reasonable fudge factor and then discover I didn't leave enough. One example is a delimited string. I have to parse the string and search for stuff. If the string to be parsed is longer than I expected, one of the parameters can get cut in two, and I go looking for part of a parameter, which may or may not give sensible results.

    If the DATALENGTH ends up being greater than your "right size", throw an error. While this may seem to be a chore, its part of the parameter validation that I would normally do on varchars that I'm not sure won't grow. Examples of that are address columns, name columns, etc.

    How do I find out that it's too long? That's my problem - SQL Server simply chops it off, instead of telling me that I have too much. And manually putting in logic to test the length in the application leaves me with the same problem - I have no machine-reliable way of telling how long a parameter SQL Server will allow, so I'm back to manually examining the T-SQL code, and coding the application to match. If one changes, there's no way for the other to know. I can easily change one and forget to do the other, which is where I am now.

    A few milliseconds response time is nothing compared to your hair pulling time but you do have to remember the big picture as you code. If, for example, the parameter is used on a function that is called millions of times a day, the milliseconds you're talking about becomes a non-trivial load on the server. I'm currently fighting dozens of such problems on a system I'm trying to cleanup. One such function (for example) was using 45 cpu minutes per day and I got that down to about 4.5 cpu seconds per day by using such simple tweaks. And, yeah... put enough of those together and you end up with some real savings on a busy server.

    So, use the +1 method I told you about if there are any doubts. Either that or remember the hair pulling you just went through and make a truncation check one of the first things you do rather than the last. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • pdanes (12/19/2012)


    I declare varchar(100) to be safe, then I change the calling logic, or add some padding or delimiters, or some such, and suddenly I'm over 100.

    I'm going to be unpopular here but...

    Dependency analysis?

    Regression testing?

    Unit tests?

    Sure, making all parameters and variables varchar(max) would work. Shall we make all table columns varchar max too (if not, then you might get silent truncation inserting into a table)?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Whatever your data structures are, so should your parameters be, both in data type and length. What do the structures look like?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • GilaMonster (12/19/2012)


    I'm going to be unpopular here but...

    Dependency analysis?

    Regression testing?

    Unit tests?

    Sure, making all parameters and variables varchar(max) would work. Shall we make all table columns varchar max too (if not, then you might get silent truncation inserting into a table)?

    Come on, Gail, I'm serious. You can always expand a scenario out to the point where it's completely retarded. That doesn't help. I'm asking how people deal with the fact that there is no solid link between an application passing parameters and SQL Server accepting parameters. If I call a stored procedure with mismatched paramaters, I get a warning. If I call it with excessively long parameters, I don't. It's easy to just tell me to not be stupid and check everything, but I'm human - I overlook things, I get distracted, I think I checked something, but then forget to save it... You've certainly done your share of such things, we all do. I'm wondering if anyone has developed some sort of systematic way of keeping an eye on themselves when working with this particular situation - application parameters passed to SQL Server, since in this case, the software doesn't alert you to mistakes, as it does in many other situations.

  • pdanes (12/19/2012)


    GilaMonster (12/19/2012)


    I'm going to be unpopular here but...

    Dependency analysis?

    Regression testing?

    Unit tests?

    Sure, making all parameters and variables varchar(max) would work. Shall we make all table columns varchar max too (if not, then you might get silent truncation inserting into a table)?

    Come on, Gail, I'm serious. You can always expand a scenario out to the point where it's completely retarded. That doesn't help. I'm asking how people deal with the fact that there is no solid link between an application passing parameters and SQL Server accepting parameters. If I call a stored procedure with mismatched paramaters, I get a warning. If I call it with excessively long parameters, I don't. It's easy to just tell me to not be stupid and check everything, but I'm human - I overlook things, I get distracted, I think I checked something, but then forget to save it... You've certainly done your share of such things, we all do. I'm wondering if anyone has developed some sort of systematic way of keeping an eye on themselves when working with this particular situation - application parameters passed to SQL Server, since in this case, the software doesn't alert you to mistakes, as it does in many other situations.

    Umm... Gail's suggestion is to catch the mistakes made by human error. If you don't, you are doing your employer a disservice. I believe that data validation should occur on the application side and not on the database side. Data relationship integrity is where the database constraints come in. Gail's point is, test all possible situations.

    Jared
    CE - Microsoft

  • Grant Fritchey (12/19/2012)


    Whatever your data structures are, so should your parameters be, both in data type and length. What do the structures look like?

    Well, the data structure may be a varchar(1000) containing, for instance, the locality where a particular fossil specimen was found. I may want to find all specimens that were found somewhere in Germany. My parameter then might look like 'Germany|Deutschland|GDR|Nemecko'. I would parse the parameter to generate a WHERE clause like

    WHERE Locality Like '%Germany%' OR Locality Like '%Deutschland%' OR Locality Like '%GDR%' OR Locality Like '%Nemecko%'

    The particular parameter that gave me my latest ulcer was:

    '1E,D,E,Es,F,Fs,G,Gs,H,Hs,J,K,Ks,L,Ls,Lc,M,Me,N,Ns,O,Os,Ob,Oc,P,Ps,Pa,Pb,Pc,R,Ra,S,Ss,Sb,Sc,T,Ts,Ta,Tb,Tc,Ua,Ub,Uc,Va,Z,Zs'

    It gets parsed into a table variable that then gets matched against another table. This combination of letters is exhaustive - there are 45, and extremely unlikely to be any more, at least any time soon. They're title codes of historical catalogs, and not one has been added in the fifteen years I've been working with this institution. So, 45 * 2 = 90, and some are only one letter instead of two, varchar(100) should be safe. Except for the commas... Varchar(max), or at least varchar(4000) would have completely prevented a whole bunch of frustration. So would have not making the mistake of overlooking the commas - if you know some trick for preventing stupid mistakes, I'd certainly welcome that.

  • Jeff Moden (12/19/2012)


    So, use the +1 method I told you about if there are any doubts. Either that or remember the hair pulling you just went through and make a truncation check one of the first things you do rather than the last. 😉

    Again, how? Do you mean simply code the app so that it checks the length before passing the parameter? It's something, but again, easy to forget or get wrong. When you say 'truncation check', do you know of some reliable way to detect that such truncation has occurred?

Viewing 15 posts - 1 through 15 (of 52 total)

You must be logged in to reply to this topic. Login to reply