Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««1213141516»»»

Why doesn’t ISNUMERIC work correctly? (SQL Spackle) Expand / Collapse
Author
Message
Posted Monday, September 17, 2012 6:26 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:22 AM
Points: 36,767, Visits: 31,223
josuecalvo (9/14/2012)
What you are asking for is called regex. Google it, there are plenty of very good sites about it. I do not think I can improve its flexibility, so no need to waste time on that.


I've not only Googled it in the past, I've had a great number of "conversations" about it on various forums and blogs. In most cases and because of the way SQL Server handles things, if you can do something with LIKE in SQL Server, it will usually be quicker to execute in SQL Server than making a trip to RegEx.

You offered that a conversion to money would be more effecient and it's very possible that it could be and, since you made the suggestion, I assumed (shame on me) that you had actually done such a thing in the past and wanted to see the code. I did voice a couple of concerns in anticipation of seeing your code to give you a leg up and an opportunity to shine.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1360145
Posted Monday, September 17, 2012 7:56 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, July 7, 2014 8:54 AM
Points: 304, Visits: 518
Christian Buettner-167247 (9/17/2012)
Lisa Slater Nicholls (9/14/2012)
ronmoses (9/14/2012)
Rather than parsing the string, how about attempting to CAST the string in a TRY...CATCH block and return a pass/fail value on that basis?

ron


The problem here would be performance, no? Especially if you were trying to do this in-line in a sql statement (for example an update), you'd need a function?

>L<

You cannot use a function, as you cannot use TRY / CATCH (side affecting) within functions.
Otherwise I would not have used a procedure in my example above.


Oops, sorry.

So, for my purposes -- use case scenario is: procedure only, handling the input from a primitive service that passes everything as strings to update a single row -- it might actually be a good idea. There's no chance that somebody would be tempted to abstract it into a function and mis-use it elsewhere!

And since you couldn't do a TRY/CATCH in-line either, you don't have to think about big data sets from that perspective.

The remaining perf issue would be "what if the proc were in a position to be called very rapidly by a huge number of clients." Jeff or somebody already alluded to this by saying "don't ever do anything that might not scale, because you never know". I think that's true, but the risk *might* be acceptable here.

Then you'd be left with non-perf issues, which I'll summarize by saying "what's the most appropriate way to handle the code in the CATCH" and I'm not sure it would be elegant. In general (coming from a .NET, not SQL, perspective) it's not a great idea to do validation in this way.

Sounds more and more like an equivalent to TryParse (IsInteger, IsDecimal, whatever) is worth a request...

OTOH: While a native function (by which I mean: a new SQL construct, not us using the CLR) would not have the side-effects restriction even though it might be using TRY/CATCH internally. However if that is how it was implemented, it would still have a slight perf penalty, I bet.

Sorry, just thinking out loud.

Post #1360217
Posted Monday, September 17, 2012 8:56 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 3:56 PM
Points: 11,192, Visits: 11,096
I think I have read all the comments (from all the times this article has been published) but I haven't seen anyone mention the new T-SQL conversion functions in SQL Server 2012:

TRY_CONVERT
TRY_CAST
TRY_PARSE




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #1360259
Posted Monday, September 17, 2012 9:31 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, July 7, 2014 8:54 AM
Points: 304, Visits: 518
SQL Kiwi (9/17/2012)
I think I have read all the comments (from all the times this article has been published) but I haven't seen anyone mention the new T-SQL conversion functions in SQL Server 2012:

TRY_CONVERT
TRY_CAST
TRY_PARSE


There you go. Thank you Paul. I think I did sort of mention it by alluding to the .NET equivalent (TryParse) and suggesting that this was the appropriate SQL enhancement request.

I am not using SQL 2012 very much here yet, and didn't know it was already done!

However... if they implemented the TRY_* functions via an internal error catch as I said before I think there might be a performance penalty for using it. I remember something like this in the .NET world, for the life of me can't remember what it was and probably wasn't the TryParse function, but it had something to do with GUIDs. If I can remember what it was I will edit this.
Post #1360284
Posted Monday, September 17, 2012 10:51 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 3:56 PM
Points: 11,192, Visits: 11,096
Lisa Slater Nicholls (9/17/2012)
However... if they implemented the TRY_* functions via an internal error catch as I said before I think there might be a performance penalty for using it. I remember something like this in the .NET world, for the life of me can't remember what it was and probably wasn't the TryParse function, but it had something to do with GUIDs. If I can remember what it was I will edit this.

I hear what you are saying, though the issues in an interpreted language like T-SQL can be quite different from those in compiled languages like .NET. TRY_PARSE has more overhead than the others, mostly because it is more powerful and calls into the CLR, but a quick test on a million rows showed TRY_CAST on a 99%-failing conversion to run in 240ms compared with 200ms on an equivalent query without the call. Where all values pass the CAST, there was no measurable difference at all.

In any case, the assumption must be that TRY_* will only be used where necessary, and (relatively) slow as the intrinsic functions may be, it will likely out-perform any other T-SQL method, while having the advantage of being presumably very well-tested.




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #1360338
Posted Monday, September 17, 2012 11:17 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Monday, May 26, 2014 10:11 AM
Points: 452, Visits: 820
Very nicely thought indeed. Great Not to mention how tricky it is to understand NOT LIKE %[NOT DIGITS]% Also the reasons why ISNUMERIC fails (do not work as desired ) are very informative.

Only one thing I wanted to say is that requests for handling decimal and positive/negative numbers seems quite rightful to me (May be because I had to deal with such scenario before )

Talking of inline code OR iSF, may be something like the following could be a starting point

/*======= POPULATION OF TEMPORARY DATA =======*/
IF OBJECT_ID('tempdb..#test') IS NOT NULL
DROP TABLE #test
GO
CREATE TABLE #test ( string VARCHAR(10) )

INSERT INTO #test
SELECT '12334'
UNION ALL SELECT 'abc1234'
UNION ALL SELECT '123.4'
UNION ALL SELECT '-1233.'
UNION ALL SELECT '-123.33'
UNION ALL SELECT '-123.33.'
UNION ALL SELECT '-123.33-'
UNION ALL SELECT '.123'
UNION ALL SELECT ''
UNION ALL SELECT '123.33-'
UNION ALL SELECT '123.33+'
UNION ALL SELECT '-123.33+'
/*========================================*/


SELECT *
FROM #test
/*========= TRIM THE STRING AND REPLACE ANY UNWANTED CHARACTERS ===========================================*/
CROSS APPLY ( SELECT REPLACE(REPLACE(REPLACE(REPLACE(LTRIM(RTRIM([string])), --TRIM TRAILING SPACES
CHAR(9),''), --REPLACE ALL TABS WITH ZERO LENGTH STRING
CHAR(10), ''), --REPLACE ALL LINE FEEDS WITH ZERO LENGTH STRING
CHAR(13), ''), --REPLACE ALL CARRIAGE RETURNS WITH ZERO LENGTH STRING
CHAR(32),'') --REPLACE ALL SINGLE SPACES WITH ZERO LENGTH STRING
) TR ( TrimmedString )

CROSS APPLY ( SELECT LEFT(TrimmedString, 1), --FIRST CHARACTER OF THE STRING
RIGHT(TrimmedString, 1) --LAST CHARACTER OF THE STRING
) FLCHR ( FirstCharacter, LastCharacter )

/*========= +VE and -VE SIGNs ARE REMOVED ONLY IF THOSE ARE FIRST OR LAST CHARACTERS ========================*/
CROSS APPLY ( SELECT CASE WHEN FirstCharacter IN ('-','+') --CHECK IF FIRST LETTER OF STRING IS EQUAL TO NEGATIVE SIGN "-" OR POSITIVE SIGN "+"
AND LastCharacter NOT IN ('-','+') --AND LAST LETTER IS NOT EQUAL TO NEGATIVE SIGN "-" OR POSITIVE SIGN "+"
THEN RIGHT(TrimmedString, LEN(TrimmedString) - 1) --THEN REMOVE THE FIRST CHARACTER

WHEN LastCharacter IN ('-','+') --CHECK IF LAST LETTER OF STRING IS EQUAL TO NEGATIVE SIGN "-" OR POSITIVE SIGN "+"
AND FirstCharacter NOT IN ('-','+') --AND FIRST LETTER IS NOT EQUAL TO NEGATIVE SIGN "-" OR POSITIVE SIGN "+"
THEN LEFT(TrimmedString, LEN(TrimmedString) - 1) --THEN REMOVE THE LAST CHARACTER

ELSE TrimmedString -- OTHERWISE KEEP THE ACTUAL STRING
END
) SWAS ( StringWithoutAnySIGN )

/*========== DECIMAL IS REMOVED IF THERE IS ONLY ONE INSTANCE OF IT. OTHERWISE THE INPUT SHOULD BE CONSIDERED AS WRONG ===*/
CROSS APPLY ( SELECT CASE WHEN CHARINDEX('.', StringWithoutAnySIGN,
ISNULL(NULLIF(CHARINDEX('.', StringWithoutAnySIGN),0)+1,
LEN(StringWithoutAnySIGN)+ 1)) > 0 --IF THERE ARE MULTIPLE OCCURENCES OF DECIMALS i.e. "."
THEN StringWithoutAnySIGN -- KEEP THE NON NEGATIVE STRING
ELSE REPLACE(StringWithoutAnySIGN, '.', '') --OTHERWISE REPLACE THE DECIMAL FROM THE STRING
END
) y ( FinalString )
/*======= -- UNCOMMENT THE FOLLOWING BLOCK TO APPLY THE FILTER ==*/
--WHERE [y].[FinalString] NOT LIKE '%[^0-9]%'
--AND LEN([y].[FinalString]) -- ONLY STRINGS WITH MORE THAN ZERO CHARACTER ARE TO BE INCLUDED.
-- > 0 -- ALSO LEN FUNCTION DO NOT COUNT TRAILING SPACES
/*===============================================================*/

It is all according to what I thought could be my requirements. Everyone can have his/her own requirements. Like having +/- at the end of the string may be invalid to someone. So your IsAllDigits function seems to be a more generic approach and can be easily modified such like I did.

Another thing I thought of that currency part can be handled the same way. Either it can be starting character OR can be a second character if the +/- are the first characters. Vice versa is the case for +/- in case of first and second characters while dealing with currency one character abbreviations. (Although in my case it was 3 characters abbreviation ISO standard and needed quite a bit of manual intervention )
Post #1360363
Posted Monday, September 17, 2012 11:37 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Monday, May 26, 2014 10:11 AM
Points: 452, Visits: 820
SQL Kiwi (9/17/2012)

In any case, the assumption must be that TRY_* will only be used where necessary, and (relatively) slow as the intrinsic functions may be, it will likely out-perform any other T-SQL method, while having the advantage of being presumably very well-tested.


Very valuable point indeed. It would have been tested in a much better way. I hope there are no bugs left in those functions
But one thing I cannot understand is that if these functions are CLRs then why they cannot be part of earlier versions? Why only SQL 2012? I know MS could come up with various reasons and may could need manual intervention, but still could have been possible? Any way of finding the respective DLLs kind of things and register? Is it legitimate (I guess it would not be)? May be remembering the exercise of such data cleansing is making me more pessimistic
Post #1360377
Posted Monday, September 17, 2012 12:25 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, June 18, 2014 10:58 AM
Points: 6, Visits: 119
Brandy

I do not question Jeff merits and impressive contributions, nor I question yours. honestly, no sarcasm here. But I did feel his response as being as rude to me in the first place as my answer was to him. It is your opinion it was not, but you cannot know how I felt about that.

If I may have offended Jeff I do apologise to him, but again I think that is for Jeff to say.

On top of that, do you mean that because I am a rookie my point is not valid? What is my skill level is not for me to say but I do not think it is healthy to discuss the validity of a technical point based on someone's resume or community contributions background rather than proof.

Besides:
- I still think that what Jeff was asking me to code should rather be done using regex.
- I still think that the title of the article is rather unfortunate and I do not agree with Jeff explanation of why he choose it. However that was his call and this is my personal opinion.

Post #1360394
Posted Monday, September 17, 2012 1:07 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, July 7, 2014 8:54 AM
Points: 304, Visits: 518
SQL Kiwi (9/17/2012)
[...] the issues in an interpreted language like T-SQL can be quite different from those in compiled languages like .NET. TRY_PARSE has more overhead than the others, mostly because it is more powerful and calls into the CLR, but a quick test on a million rows showed TRY_CAST on a 99%-failing conversion to run in 240ms compared with 200ms on an equivalent query without the call. Where all values pass the CAST, there was no measurable difference at all.

In any case, the assumption must be that TRY_* will only be used where necessary, and (relatively) slow as the intrinsic functions may be, it will likely out-perform any other T-SQL method, while having the advantage of being presumably very well-tested.


Totally agree. I would definitely use it without real concern -- unlike implementing in CLR extensions myself.

And it is also much easier to read/maintain than any other method . Honestly I would echo somebody else on this thread in saying I can't understand why MS didn't do it earlier.

>L<
Post #1360417
Posted Monday, September 17, 2012 1:25 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:22 AM
Points: 36,767, Visits: 31,223
josuecalvo (9/17/2012)
Brandy

I do not question Jeff merits and impressive contributions, nor I question yours. honestly, no sarcasm here. But I did feel his response as being as rude to me in the first place as my answer was to him. It is your opinion it was not, but you cannot know how I felt about that.

If I may have offended Jeff I do apologise to him, but again I think that is for Jeff to say.

On top of that, do you mean that because I am a rookie my point is not valid? What is my skill level is not for me to say but I do not think it is healthy to discuss the validity of a technical point based on someone's resume or community contributions background rather than proof.

Besides:
- I still think that what Jeff was asking me to code should rather be done using regex.
- I still think that the title of the article is rather unfortunate and I do not agree with Jeff explanation of why he choose it. However that was his call and this is my personal opinion.



Nope... no offense taken. You stated that this could be done with a conversion to the money datatype and I'm still looking forward to that code.

Additionally, from previous tests done, I still believe that RegEx will be slower and unnecessary in this case. If you'd be willing to do a test on your machine, I could easily provide you with a million row test table so we can find out.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1360426
« Prev Topic | Next Topic »

Add to briefcase «««1213141516»»»

Permissions Expand / Collapse