Why doesn’t ISNUMERIC work correctly? (SQL Spackle)

  • jeffem (9/14/2012)


    Very nice article! There is a typo:

    "164 (Yen sign)" should actually reference 165.

    Crud. My apologies. I noticed that in the past (this article has been posted a couple of times now) and just haven't gotten to fixing that (there's another error IIRC but can't recall what that is so I'll have to look for it). Thank you for both the catch and the reminder.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (9/14/2012)


    Lisa Slater Nicholls (9/14/2012)


    I see that the IsReallyNumeric codeset on http://classicasp.aspfaq.com/general/what-is-wrong-with-isnumeric.html, pointed to by various people here, does pretty much what I want.

    Oh, be careful, Lisa. The "filter" built into the method of the article can discriminnate for or against just about anything you want, as well. The problem with the method used in the link you cited is that it uses a scalar function which has some pretty severe performance problems that I wouldn't even justify the use of for supposedly small data sets.

    The method in the article can use an "iSF" or "Inline Scalar Function" (for lack of a better term). Please see the following information for more on that subject.

    http://www.sqlservercentral.com/articles/T-SQL/91724/

    Right Jeff, thanks -- I understand and appreciate the clarification. But, FWIW, I was saying "this is the functionality that I, and others, want built into SQL Server" -- IsInteger() etc. I'm not likely to use the function as-is, except in the following case, which happens to be an important use case in my world: a set of string values is brought into a stored procedure from a fairly primitive web interface for the purpose of an upsert statement, always handling ONE row. Other, similarly costly, validations would also be performed in the procedure, and it's really appropriate here.

    Finally, though, since I have your attention <gd&r>... I thought this piece of code was pretty cool as a method : http://codecorner.galanter.net/2009/04/03/tsql-isnumeric-function-returns-false-positives/ -- but would like your opinion on any downsides?

    >L<

  • Converting into money and from there into int, float or some other numeric data type could be a valid solution for many cases and would perform better than a UDF. Just another option that can save precious time for some people looking into the forum.

    Besides, I do not think it is correct to claim that ISNUMERIC() does not work as the title affirms.

  • josuecalvo (9/14/2012)


    Converting into money and from there into int, float or some other numeric data type could be a valid solution for many cases and would perform better than a UDF. Just another option that can save precious time for some people looking into the forum.

    Besides, I do not think it is correct to claim that ISNUMERIC() does not work as the title affirms.

    If you read the article, I also [font="Arial Black"]don't[/font] think that ISNUMERIC() doesn't work for what it was designed to do. The title of the article was specifically designed to help people that think ISNUMERIC is broken to find a better answer when they Google.

    Your code example also provides a test only for single characters. While I appreciate your suggestion above, I'd like to see the manifestation of your actual working code to solve the problem of how to determine if a string value can be converted to an integer that has the same as the integer value a human would derive from lookinng at the string. For an example, a human can look at a string like 1,2,3,4,5.6 (notice the decimal point, as well) or even just 12345.6 or 1E-3 and realize that it's not a single integer value and would reject it. Please show in working code how your suggestion would arrive at the same conclusion.

    I'm not asking this simply as a challenge. It's very possible that you have a more effective notion than anyone else has presented and would like to see it in the form of working code because I'm just not seeing how it could be made to work more effectively than those things already presented. I'm always up for learning something new and that's especially why I like the discusssions that follow these articles.

    So far as UDFs being some form of performance tax goes, not all UDFs are created equal. Using UDFs isn't a bad thing. Using Scalar and Muli_Line table value functions is.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • What you are asking for is called regex. Google it, there are plenty of very good sites about it. I do not think I can improve its flexibility, so no need to waste time on that.

  • duplicate post deleted

    Tom

  • Lisa Slater Nicholls (9/14/2012)


    But, FWIW, I was saying "this is the functionality that I, and others, want built into SQL Server" -- IsInteger() etc.

    That's fair enough. If you want onme (or perhaps several) functions which are different from IsNumeric it's agood idea to say so. It would of course be better to say so in something clearly intended as input for the standards committees, rather than here.

    Finally, though, since I have your attention <gd&r>... I thought this piece of code was pretty cool as a method : http://codecorner.galanter.net/2009/04/03/tsql-isnumeric-function-returns-false-positives/ -- but would like your opinion on any downsides?>L<

    I can see two downsides: the elephant in the room is the simple fact that isnumeric doesn't return false positives, despite the crazy URL and the crazy text on the page - so the whole webpage is based on a falsehood, and when read by anyone who doesn't recognise and understand that it is an offensively damaging webpage. The second downside is related to the fact (apparently unknown by the author of the article) that "D" and "E" are not the same letter. Those two little problems seem to me to demonstrate that the page isn't worth the screen-space it takes up when anyone bothers to display it. We could then add in questions about the use of commas, points, and so on, whether the test works independently of language settings (although if the test is interopreted as relative to current language settings that might not be a problem - of course if you look at Indian numerics do you work in thousands or in hundreds for seperators after the first thousand - you had better understand both, but the code suggested on the page you reference understands neither).

    But you didn't ask me, you asked Jeff; I think his paper that instigated this discussion makes it pretty clear that he thinks that IsNumeric returns no false positives, but of course I can easily be wrong.

    Tom

  • L' Eomot Inversé (9/14/2012)


    Lisa Slater Nicholls (9/14/2012)


    But, FWIW, I was saying "this is the functionality that I, and others, want built into SQL Server" -- IsInteger() etc.

    That's fair enough. If you want onme (or perhaps several) functions which are different from IsNumeric it's agood idea to say so. It would of course be better to say so in something clearly intended as input for the standards committees, rather than here.

    Finally, though, since I have your attention <gd&r>... I thought this piece of code was pretty cool as a method : http://codecorner.galanter.net/2009/04/03/tsql-isnumeric-function-returns-false-positives/ -- but would like your opinion on any downsides?>L<

    I can see two downsides: the elephant in the room is the simple fact that isnumeric doesn't return false positives, despite the crazy URL and the crazy text on the page - so the whole webpage is based on a falsehood, and when read by anyone who doesn't recognise and understand that it is an offensively damaging webpage. The second downside is related to the fact (apparently unknown by the author of the article) that "D" and "E" are not the same letter. Those two little problems seem to me to demonstrate that the page isn't worth the screen-space it takes up when anyone bothers to display it. We could then add in questions about the use of commas, points, and so on, whether the test works independently of language settings (although if the test is interopreted as relative to current language settings that might not be a problem - of course if you look at Indian numerics do you work in thousands or in hundreds for seperators after the first thousand - you had better understand both, but the code suggested on the page you reference understands neither).

    But you didn't ask me, you asked Jeff; I think his paper that instigated this discussion makes it pretty clear that he thinks that IsNumeric returns no false positives, but of course I can easily be wrong.

    I was not concerned in my question with whether IsNumeric returns false positives -- having examined the documentation closely myself (long before this discussion) I am pretty sure that IsNumeric is correct according to its docs. And I am not trying to validate the page contents as an article, just the code itself.

    I just thought the code in the article was an interesting way to do something. It looked efficient. I had never tried to do anything that way, and not knowing enough, wanted to know if it was considered generally valid.

    While, yes, I asked Jeff, I do appreciate your opinion too and anybody else's -- after all, when I first cited the code I think I explicitly asked for anybody's response! Thank you for yours.

    If you think that the page is offensive and damaging, by all means write to the author and perhaps s/he will improve it or, if sufficiently in agreement with your concerns, remove it. Nobody deliberately writes to mislead people on technical issues.

    Regarding whether it would be a "better" idea to write to the standards committee, I do take your point. Discussions here, however, are often a way to gather information and consensus before trying to suggest a generalized, well-thought-out recommendation to a standards committee -- at least in my world. This discussion helped me see that many people had the same issues that I did and came up with a number of ways to describe what they wanted -- a thing separate from IsNumeric, and also separate from Jeff's IsDigits functionality -- and my original message was an attempt to say "there seem to be a lot of us, it is a suggestion worth pursuing, isn't it?"

    >L<

  • Lisa Slater Nicholls (9/14/2012)


    ronmoses (9/14/2012)


    Rather than parsing the string, how about attempting to CAST the string in a TRY...CATCH block and return a pass/fail value on that basis?

    ron

    The problem here would be performance, no? Especially if you were trying to do this in-line in a sql statement (for example an update), you'd need a function?

    >L<

    You cannot use a function, as you cannot use TRY / CATCH (side affecting) within functions.

    Otherwise I would not have used a procedure in my example above.

    Best Regards,

    Chris Büttner

  • josuecalvo (9/14/2012)


    What you are asking for is called regex. Google it, there are plenty of very good sites about it. I do not think I can improve its flexibility, so no need to waste time on that.

    Wow. The way it is worded, this response comes off as a little rude.

    Jeff is a local SQL Expert who has contributed a lot to this community and to the SQL Server community at large. All he did was ask you if *you* had a solution (different code) that might work better than his, and he did so very politely. He's hardly a rookie that needs to be schooled in the use of Google or Books Online.

    In the future, you might want to consider a better way of phrasing your replies before responding like the above. You might also consider researching the person you're attempting to "school" before you treat them so poorly. You're not likely to endear yourself to many people on these forums by attacking one of our top resources. Jeff's helped a lot of people in these parts.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • josuecalvo (9/14/2012)


    What you are asking for is called regex. Google it, there are plenty of very good sites about it. I do not think I can improve its flexibility, so no need to waste time on that.

    I've not only Googled it in the past, I've had a great number of "conversations" about it on various forums and blogs. In most cases and because of the way SQL Server handles things, if you can do something with LIKE in SQL Server, it will usually be quicker to execute in SQL Server than making a trip to RegEx.

    You offered that a conversion to money would be more effecient and it's very possible that it could be and, since you made the suggestion, I assumed (shame on me) that you had actually done such a thing in the past and wanted to see the code. I did voice a couple of concerns in anticipation of seeing your code to give you a leg up and an opportunity to shine.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Christian Buettner-167247 (9/17/2012)


    Lisa Slater Nicholls (9/14/2012)


    ronmoses (9/14/2012)


    Rather than parsing the string, how about attempting to CAST the string in a TRY...CATCH block and return a pass/fail value on that basis?

    ron

    The problem here would be performance, no? Especially if you were trying to do this in-line in a sql statement (for example an update), you'd need a function?

    >L<

    You cannot use a function, as you cannot use TRY / CATCH (side affecting) within functions.

    Otherwise I would not have used a procedure in my example above.

    Oops, sorry.

    So, for my purposes -- use case scenario is: procedure only, handling the input from a primitive service that passes everything as strings to update a single row -- it might actually be a good idea. There's no chance that somebody would be tempted to abstract it into a function and mis-use it elsewhere!

    And since you couldn't do a TRY/CATCH in-line either, you don't have to think about big data sets from that perspective.

    The remaining perf issue would be "what if the proc were in a position to be called very rapidly by a huge number of clients." Jeff or somebody already alluded to this by saying "don't ever do anything that might not scale, because you never know". I think that's true, but the risk *might* be acceptable here.

    Then you'd be left with non-perf issues, which I'll summarize by saying "what's the most appropriate way to handle the code in the CATCH" and I'm not sure it would be elegant. In general (coming from a .NET, not SQL, perspective) it's not a great idea to do validation in this way.

    Sounds more and more like an equivalent to TryParse (IsInteger, IsDecimal, whatever) is worth a request...

    OTOH: While a native function (by which I mean: a new SQL construct, not us using the CLR) would not have the side-effects restriction even though it might be using TRY/CATCH internally. However if that is how it was implemented, it would still have a slight perf penalty, I bet.

    Sorry, just thinking out loud.

  • I think I have read all the comments (from all the times this article has been published) but I haven't seen anyone mention the new T-SQL conversion functions in SQL Server 2012:

    TRY_CONVERT

    TRY_CAST

    TRY_PARSE

  • SQL Kiwi (9/17/2012)


    I think I have read all the comments (from all the times this article has been published) but I haven't seen anyone mention the new T-SQL conversion functions in SQL Server 2012:

    TRY_CONVERT

    TRY_CAST

    TRY_PARSE

    There you go. Thank you Paul. I think I did sort of mention it by alluding to the .NET equivalent (TryParse) and suggesting that this was the appropriate SQL enhancement request.

    I am not using SQL 2012 very much here yet, and didn't know it was already done!

    However... if they implemented the TRY_* functions via an internal error catch as I said before I think there might be a performance penalty for using it. I remember something like this in the .NET world, for the life of me can't remember what it was and probably wasn't the TryParse function, but it had something to do with GUIDs. If I can remember what it was I will edit this.

  • Lisa Slater Nicholls (9/17/2012)


    However... if they implemented the TRY_* functions via an internal error catch as I said before I think there might be a performance penalty for using it. I remember something like this in the .NET world, for the life of me can't remember what it was and probably wasn't the TryParse function, but it had something to do with GUIDs. If I can remember what it was I will edit this.

    I hear what you are saying, though the issues in an interpreted language like T-SQL can be quite different from those in compiled languages like .NET. TRY_PARSE has more overhead than the others, mostly because it is more powerful and calls into the CLR, but a quick test on a million rows showed TRY_CAST on a 99%-failing conversion to run in 240ms compared with 200ms on an equivalent query without the call. Where all values pass the CAST, there was no measurable difference at all.

    In any case, the assumption must be that TRY_* will only be used where necessary, and (relatively) slow as the intrinsic functions may be, it will likely out-perform any other T-SQL method, while having the advantage of being presumably very well-tested.

Viewing 15 posts - 121 through 135 (of 168 total)

You must be logged in to reply to this topic. Login to reply