Using PATINDEX to find all numeric values

Question

Post reply

Using PATINDEX to find all numeric values

cdun2

SSCertifiable

Points: 7893
More actions
February 14, 2008 at 3:07 pm

#182175

Hello,
I forget. How can PATINDEX be used to find column values where the data is all numeric? I've tried, the following:
select distinct acct from tbl_CYProcessedSales
where PatIndex('%[0-9]%',Acct) > 0
order by acct
Acct is varchar(8). What am I doing wrong?
Thank you for your help!
CSDunn

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Adam Haines SSC-Insane Points: 23217 More actions · Answer 1

I would use ISNUMERIC

select distinct acct

from tbl_CYProcessedSales

where ISNUMERIC(Acct) > 0

order by acct

My blog: http://jahaines.blogspot.com

Jeff Moden SSC Guru Points: 1004619 More actions · Answer 2

Oh... be careful, now... IsNumeric cannot be equated to IsAllDigits. IsNumeric will allow currency symbols, $, comma, decimal point, spaces, tabs, and a host of other individual characters. I will also allow for combinations of charcters...

SELECT ISNUMERIC('2d2'), ISNUMERIC('2e2')

In fact, I wrote a pretty hefty article about this on another web site...

ISNUMERIC is not “ALL DIGITS”Submitted by Jeff Moden, 03 Jun 2006

Introduction:

There are many cases where you need to ensure that the string data you are working with includes only numeric digits. Most Developers will use the built in ISNUMERIC function to make such a check. Here’s why that’s a bad idea and what to do about it.

What is ISNUMERIC?

“Books OnLine” summarizes the description of the ISNUMERIC function as:

“Determines whether an expression is a valid numeric type.”

and that’s a 100% accurate description that leaves much to be desired. Just what is a “valid numeric type”? Reading further in BOL (short for “Books OnLine), we find additional information:

“ISNUMERIC returns 1 when the input expression evaluates to a valid integer, floating point number, money or decimal type; otherwise it returns 0. A return value of 1 guarantees that expression can be converted to one of these numeric types.”

Again, read the wording… “when the input expression evaluates to a valid integer”, etc, etc. And, that’s the catch. There are many different things that you may not expect that will evaluate to one of the data types listed in the description of ISNUMERIC and a lot of them are NOT the digits 0-9. ISNUMERIC will return a “1” for all of them.

Let’s consider the most obvious… what will ISNUMERIC(‘-10’) return? What will ISNUMERIC(‘1,000’) return? And how about the not-so-obvious… what will ISNUMERIC('0d1234') or ISNUMERIC('13e20') return? There are many different combinations of letters, numbers, and symbols that can actually be converted to numeric data types and ISNUMERIC will return a “1” for all of them. It’s not a flaw… that’s the way it’s supposed to work!

What IS Actually Considered “Numeric” by ISNUMERIC?

This code will show all of the single characters that ISNUMERIC thinks of as “Numeric”…

--===== Return all characters that ISNUMERIC thinks is numeric

-- (uses values 0-255 from the undocumented spt_Values table

-- instead of a loop from 0-255)

SELECT [Ascii Code] = STR(Number),

[Ascii Character] = CHAR(Number),

[ISNUMERIC Returns] = ISNUMERIC(CHAR(Number))

FROM Master.dbo.spt_Values

WHERE Name IS NULL

AND ISNUMERIC(CHAR(Number)) = 1

That code produces the following list of characters…

Ascii Code Ascii Character ISNUMERIC Returns

---------- --------------- -----------------

9 1

10

1

11

1

12 1

13

1

36 $ 1

43 + 1

44 , 1

45 - 1

46 . 1

48 0 1

49 1 1

50 2 1

51 3 1

52 4 1

53 5 1

54 6 1

55 7 1

56 8 1

57 9 1

128 € 1

160 1

163 £ 1

164 ¤ 1

165 ¥ 1

What are these characters?

Ascii 9 is a TAB character and is included because a column of numbers is frequently delimited by a TAB.

Ascii 10 is a Line Feed character and is included because the last column of numbers is frequently terminated by a Line Feed character.

Ascii 11 is a Vertical Tab character and is included because the last column of numbers is frequently terminated by a Vertical Tab character.

Ascii 12 is a Form Feed character and is included because the last column numbers of the last row is sometimes terminated by a Form Feed character.

Ascii 13 is a Carriage Return character and is included because the last column of numbers is frequently terminated by a Carriage Return character.

Ascii 36 (Dollar sign), 128 (Euro sign), 163 (British Pound sign), and 164 (Yen sign) are included because they are frequently used as enumerators to identify the type of number or, in this case, the currency type the number is meant to represent.

Ascii 43 (Plus sign), 44 (Comma), 45 (Minus sign), and 46 (Decimal place) are included because they are frequently included in numeric columns to mark where on the number line the number appears and for simple formatting.

Ascii 160 is a special "hard space" and is included because it is frequently used to left pad numeric columns so the column of numbers appears to be right justified.

Ascii 32 is a "soft space" and is not included because a single space does not usually represent a column of numbers. Ascii 32 is, however, a valid numeric character when used to create right justified numbers as is Ascii 160 but a single Ascii 32 character is NOT numeric. In fact, a string of Ascii 32 spaces is not considered to be numeric but a string of spaces with even a single digit in it is considered to be numeric.

Ascii 164 is a special character and is included because it is frequently used by accountants and some software to indicate a total or subtotal of some type. It is also used by some to indicate they don't know what the enumerator is.

Ascii 48-59 are included because they represent the digits 0 through 9

Set of Characters Treated as “Numeric” by ISNUMERIC

Do notice that "e" and "d" (everybody forgets about this) are not included as numeric in the results because a single "e" or "d is NOT considered to be numeric. HOWEVER, these letters are for two different forms of scientific notation. So, if you have anything that looks like the following, ISNUMERIC will identify them as “Numeric”…

SELECT ISNUMERIC('0d2345')

SELECT ISNUMERIC('12e34')

The “Rational” Solution

Hopefully, I’ve proven that ISNUMERIC is NOT the way to determine if a value or a column of values IS ALL DIGITS. So, what to do? We could write something really complex that loops through each character to see if it’s a digit… or … we can use a very simple rational expression to do the dirty work for us. The formula is…

NOT LIKE '%[^0-9]%'

… and it can be used directly (preferred method for performance reasons)…

SELECT *

FROM sometable

WHERE somecolumn NOT LIKE '%[^0-9]%'

… or, if you don’t mind the performance hit, you can create your own “IsAllDigits” function…

CREATE FUNCTION dbo.IsAllDigits

/********************************************************************

Purpose:

This function will return a 1 if the string parameter contains only

numeric digits and will return a 0 in all other cases.

--Jeff Moden

********************************************************************/

--===== Declare the I/O parameters

(@MyString VARCHAR(8000))

RETURNS INT

AS

BEGIN

RETURN (SELECT CASE

WHEN @MyString NOT LIKE '%[^0-9]%'

THEN 1

ELSE 0

END)

END

If you have any questions on this article, please don’t hesitate to post them. And thanks for taking the time to read it and, perhaps, get your favorable vote ?.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Adam Haines SSC-Insane Points: 23217 More actions · Answer 3

Good article Jeff.

I guess the method of choice comes down to knowing your data. If his column is what I think it is, an integer column converted to a varchar; he should be alright, but you never know.

My blog: http://jahaines.blogspot.com

Julie Breutzmann Hall of Fame Points: 3637 More actions · Answer 4

In fact, I wrote a pretty hefty article about this on another web site...
ISNUMERIC is not “ALL DIGITS”Submitted by Jeff Moden, 03 Jun 2006
All rights reserved.

Thanks Jeff. Great article and very understandable, even for those of us who are NOT DBA's.

Julie

Jeff Moden SSC Guru Points: 1004619 More actions · Answer 5

Thanks for the feedback, Julie... hmm... maybe I should publish it here, as well.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Sean Peters-452786 SSC Enthusiast Points: 170 More actions · Answer 6

Of course this isnt a perfect solution:

What you would really like to isolate are values that won't cast as float like '11,22,33' or 1234e456', etc.

Your new function returns a value of "0" for values like '1,000,000' or '1 e -23' which will actuall cast as float just fine. Which are valid numeric values.

Now if somebody can find an efficient way of doing that short of opening a cursor and cast each value that would be usefull . . .

Jeff Moden SSC Guru Points: 1004619 More actions · Answer 7

Heh... of course it's a perfect solution... but only for what it was designed for... it's an IsAllDigits function and wasn't meant to be anything else. If you want to be able to detect anything that will convert to Float or Money, then IsNumeric works just fine.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Sean Peters-452786 SSC Enthusiast Points: 170 More actions · Answer 8

No IsNumeric('11,22,33') = 1 but cast ('11,22,33' as float) will raise an error. So isnumeric is not the perfect way to select data that will cast as float, money, or numeric, etc. from a char datatype and all its cousins.

You do have a point about working as designed. I guess I just don't see exactly how it could be useful if it's not a better mouse trap. Say hypothetically I needed a way to eliminate all char type values from a field before I casted them as float. This query:

Select Cast(columnA as float) From Table1 Where Isnumeric(ColumnA)

falls short because of the issue I raised above. I can still raise an error converting certain char values that pass the isnumeric sniff test but will not actually cast as float. If I use your function I get the opposite:

Select Cast(columnA float) From Table1 Where IsAllDigits(ColumnA)

Here I run the risk of omitting values that could safely cast as float. So in the end, IsAllDigits is fine as far as working as designed. It just may be as useful as it seems.

Nice write up though, don't get me wrong . . . =)

thegreatsamarth SSChasing Mays Points: 638 More actions · Answer 9

cdun2 (2/14/2008)
Hello,
I forget. How can PATINDEX be used to find column values where the data is all numeric? I've tried, the following:
select distinct acct from tbl_CYProcessedSales
where PatIndex('%[0-9]%',Acct) > 0
order by acct
Acct is varchar(8). What am I doing wrong?
Thank you for your help!
CSDunn

As mentioned, u want to find all the columns where data is all NUMERIC.... so will it include special symbols???

Anywayz, If u have to use PATINDEX, refer below query:

This Query exclude the data having special symbols.

SELECT DISTINCT acct

FROM tbl_CYProcessedSales

WHERE patindex('%[~`!@#$%^&*()_=+\|{};",<>/?a-z]%',acct)=0

ORDER BY acct

I hope this is what you want...

--Samarth

Jeff Moden SSC Guru Points: 1004619 More actions · Answer 10

Sean Peters (7/31/2008)
No IsNumeric('11,22,33') = 1

Ummm... nope... it isn't...

SELECT ISNUMERIC('11,22,33')

-----------

0

(1 row(s) affected)

I will agree, though, that ISNUMERIC isn't the panacea that some make it out to be. If it looks like the MONEY datatype (has comma's in the correct spots or a currency symbol), it won't convert to float (or anything except MONEY) even if ISNUMERIC says it's a convertable number.

And, thanks for the nice compliment about the write up.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

mweaver Old Hand Points: 336 More actions · Answer 11

I think this handles all values that will convert to float (except those that use 'D' for scientific notation):

create function dbo.fn_to_number

(

@num varchar(64)

)

returns float

begin

return case

-- not empty

when len(@num) > 0

-- no illegal characters

and patindex('%[^0-9.e-]%', @num) = 0

-- at most one decimal

and charindex('.',@num, charindex('.',@num) + 1) = 0

-- if there is a decimal, it must not be first or last (e.g. ".2" and "2." are both invalid)

and (charindex('.', @num) = 0 or charindex('.',@num) between 2 and len(@num) - 1)

-- at most one "e"

and charindex('e', @num, charindex('e', @num) + 1) = 0

-- if there is an "e", it must not be first or last (e.g. "e2" and "2e" are both invalid)

and (charindex('e', @num) = 0 or charindex('e',@num) between 2 and len(@num) - 1)

-- if both decimal and "e" exist, decimal must be first

and (charindex('e', @num) = 0 or charindex('.', @num) <= charindex('e', @num))

-- at most two "-" (e.g. "-2.01e-4")

and charindex('-',@num, charindex('-',@num, charindex('-',@num) + 1) + 2) = 0

-- the first "-" must either: not exist, be first, or immediately follow an "e"

and (charindex('-',@num) <=1 or substring(@num,charindex('-',@num) - 1 ,1) = 'e')

-- the second "-" must either not exist or immediately follow an "e"

and (charindex('-', @num, charindex('-',@num) + 1) = 0 or substring(@num,charindex('-',@num, charindex('-',@num) + 1) - 1, 1) = 'e')

then

cast(@num as float)

else

null

end

go

Jeff Moden SSC Guru Points: 1004619 More actions · Answer 12

Jeff Moden (8/1/2008)
Sean Peters (7/31/2008)
No IsNumeric('11,22,33') = 1
Ummm... nope... it isn't...
SELECT ISNUMERIC('11,22,33')
-----------
0
(1 row(s) affected)
I will agree, though, that ISNUMERIC isn't the panacea that some make it out to be. If it looks like the MONEY datatype (has comma's in the correct spots or a currency symbol), it won't convert to float (or anything except MONEY) even if ISNUMERIC says it's a convertable number.
And, thanks for the nice compliment about the write up.

I ran into my own old post and thought I'd update it even though it's years late. I was still stuck on SQL Server 2000 at the time and it worked as posted. The same exact code does, in fact, produce a 1 as of SQL Server 2005.

I guess this also shows that you have to really do have to regression test to a somewhat fuller extent that you might imagine. This is why I don't have the time for cumulative updates and I'll generally skip a service pack here and there unless they have something really useful in them.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Using PATINDEX to find all numeric values

Cookies on SQLServerCentral