Technical Article

TextToDecimal

,

SUMMARY:
This UDF script takes a text value(nvarchar) and returns a decimal(18,6) number. If the text value can't be interpreted as numeric, the UDF returns NULL.
-----------------------------------------------
USAGE:

SET @MyDecimal = dbo.TextToDecimal('-$123,456.73')
@MyDecimal will now be -123456.730000

SET @MyDecimal = dbo.TextToDecimal('-$123,4560.73') --bad number format
@MyDecimal will now be NULL
------------------------------------------------------
DESCRIPTION:

The ISNUMERIC function incorrectly returns 1 (True) for many non-numeric text values. Even worse, converting to a numeric data type by passing one of these non-numeric values to a CONVERT or CAST function will cause an untrappable error that immediately exits an SP, function, or T-SQL statement batch. Here are a few values that ISNUMERIC thinks are numbers, but will crash with CONVERT or CAST:

--anything with a dollar sign or a a comma in it.--
'-' --single non-numeric characters
'+'
','
'$'

',0' --misplaced commas
'1,'
'1,0'
'1.0,'
',1.0'
'1.0,012,'
'1.0,012,000'

'$1,00'
'$1,00.'
'$1,00.0,'
'10E2' --legacy exp formats
'10D2'

'0000000.,01' --more bizarre forms
'.,'

CREATE FUNCTION dbo.TextToDecimal  (@TextNum nvarchar(400)  ) 
--Richard Moldwin 9/16/05 
--Replacement for the very buggy ISNUMERIC and CONVERT functions 
--ISNUMERIC returns true for many non-numbers with 
--      non-CONVERT-able embedded characters like commas, "-", "+", and "$", etc, 
--      and also returns true for single characters that are definitely not numbers like: , # $ + - 
--CONVERT produces non-trappable errors when "$", commas, leading + sign 
--      and related formats are present in number strings. 
--      Since the errors are non-trappable (they abort the entire T-SQL statement batch) , we can't work around them with T-SQL code.

--This function fixes all of the above problems, and: 
--Accepts a text string of up to 400 Unicode characters. 
--Returns a decimal number in 18,6 format 
--      (precision = 18, scale = 6) from a input string: (123456789012.123456) 
--Returns NULL if the string can't be converted into a decimal format 
--The input @TextNum will be: 
        --Trimmed to remove leading and trailing spaces 
        --stripped of legal starting characters 
        --checked for illegal non-numeric characters 
        --checked for duplicate decimal points 
        --checked for commas in illegal positions 
        --stripped of all commas 
        --checked to ensure that the maximum integer size is small enough to be converted to (18, 6) format 
--This function: 
--does not convert scientific notation (e.g. 10E4) at this time, 
--      because conversion to the resultant decimal could easily 
--      produce overflow errors even with the maximum current fixed decimal(38)  format. 

--                      Commas 
--Any leading comma (before the first digit) is illegal and produces a non-numeric string and a NULL output 
--Commas not obeying the every-third digit rule is illegal, and will produce a non-numeric string and a NULL output 
--Commas after a decimal point are illegal and will produce a non-numeric string and a NULL output 
--In a string with otherwise valid commas, zeros may be used to pad the leading digit up to 3 digits:  
--      001,100 and 01,100, 000,000,001 and 1,100 are all valid. 
--      More than 3 leading zeros when commas are present (e.g. 0001,000 or 0123,000,000) are always illegal, 
--              and will produce a non-numeric string and a NULL output 

--Leading and trailing spaces are stripped; embedded spaces produce a non-numeric string and a NULL output 
--Leading $, -$ or +$ characters are allowed. 
--Single leading + or - is allowed. 
--Only a single decimal point is allowed. 

--To fit Decimal(18, 6) format: 
--      No more that 12 digits (18 - 6 = 12) to the left of the decimal are allowed; if there are more than 12, we return NULL

--      if there are more than 6 digits to the right of the decimal point, the extra digits will be rounded and truncated by the CONVERT function.

--      (Decimal(18, 6) may be changed to another convenient format, as desired.) 

RETURNS decimal(18, 6) --although we can easily support the maximum decimal(38) size (e.g. (38, 12) ) , this version will support only Decimal(18,6) 

BEGIN  

DECLARE @Neg AS nvarchar(1) --flag for a negative number 
DECLARE @TextLen AS  int  --length of the current working input text value 

--strip leading and trailing spaces 
SET @TextNum = LTRIM(RTRIM(@TextNum)) 
SET @TextLen = LEN(@TextNum) 
        --abort for NULL text or illegal single characters 
        IF @TextLen = 0 RETURN NULL 
        IF @TextLen = 1 
                IF @TextNum LIKE '[^0-9]'  RETURN NULL    --Don't allow non-numeric single characters. 

--strip off leading + or - or $ or -$ or +$ 
IF LEFT(@TextNum, 1) = '-'  
        BEGIN 
                SET @TextNum = SUBSTRING(@TextNum, 2, @TextLen)  
                SET @Neg = '-' --flag for a negative number 
                SET @TextLen = LEN(@TextNum) 
        END 
ELSE    
        BEGIN 
                SET @Neg = '' 
                IF LEFT(@TextNum, 1) = '+'  
                        BEGIN 
                                SET @TextNum = SUBSTRING(@TextNum, 2, @TextLen) 
                                SET @TextLen = LEN(@TextNum) 
                        END 
        END 

IF LEFT(@TextNum, 1) = '$'  
        BEGIN 
                SET @TextNum = SUBSTRING(@TextNum, 2, @TextLen) 
                SET @TextLen = LEN(@TextNum) 
        END 

--Now that we have removed any leading characters: +   -   $   -$   +$ 
--we can check for any character that is not a digit or comma, or decimal point 
IF PATINDEX('%[^0-9,.]%' ,@TextNum) >0 --non-numeric character (other than decimal point or comma) in any position 
        BEGIN 
                RETURN  NULL 
        END 

------------Decimal & Comma handling section--------------------------------------------------------------------------------------------------

--Here we see if any commas or decimal points are present or misplaced: 
        
        DECLARE @CommaPos AS TinyInt 
        DECLARE @CommaPosOld AS TinyInt 
        DECLARE @LastCommaPos AS TinyInt 
        DECLARE @NoCommaNum nVARCHAR(64) --@num with all commas removed 
        DECLARE @NumberOfCommas AS  int 
        DECLARE @LastDecimalPos AS TinyInt  --position of last decimal point 
        DECLARE @TextInt AS nvarchar(64)  --Integer version of @TextNum 
        
         --get position of first comma in string 
        SET @CommaPos = CHARINDEX(',',@TextNum) 
                IF @CommaPos = @TextLen RETURN NULL --last character is a comma 
                IF @CommaPos = 1  RETURN NULL --first character is a comma 

          --find the position of the LAST decimal point in a string 
        SET @LastDecimalPos = 1 +@TextLen - CHARINDEX('.', REVERSE(@TextNum)) 
                --Make sure we have no more than 1 decimal point 
                IF NOT( @LastDecimalPos = @TextLen + 1  --Postition of the right-most decimal 
                        OR @LastDecimalPos = CHARINDEX('.', @TextNum)) --Position of right-most decimal point equals position of left-most  decimal point; i.e there is only one decimal point

                RETURN NULL 
        
          --find the position of the right-most comma, and ensure it occurs to the left of the decimal point 
        SET @LastCommaPos =  1 +@TextLen- CHARINDEX(',', REVERSE(@TextNum)) 
        IF @LastCommaPos <= @TextLen  --we have found a comma 
                IF @LastCommaPos > @LastDecimalPos RETURN NULL --comma present after decimal, so this is not a proper number

        
        --Get the text to the left of the decimal (the integer part):  this is the only place where we will allow commas.

        SET @TextInt = SUBSTRING(@TextNum, 1, @LastDecimalPos-1) 
        IF LEN( REPLACE(@TextInt, ',', '')) >12 RETURN NULL --largest integer part we will handle is 12 digits for decimal (18,6 format)

        --This is to prevent CONVERT errors below, when the integer part has more than 12 digits. 
        --also, digits after the decimal point will be truncated to 6 digits for our preferred decimal (18, 6) format. 
        
        --now we find the last comma position in the integer part of the text (BEFORE the last decimal point) : (e.g. produces the value '7' for 1,1bc3,456.001,0)

        SET @LastCommaPos =  1 + LEN(@TextInt) - CHARINDEX(',', REVERSE(@TextInt)) 
        
        DECLARE @IntLen AS int --length of the integer part of the text 
        DECLARE @Result AS int --search results when looking for valid characters. 
        
        SET @IntLen = LEN(@TextInt) 

         --IF at least one comma is present, make sure commas are in the correct pattern 
        IF @LastCommaPos  < @IntLen + 1 
        BEGIN  --start of long section for handling commas 
                SELECT @Result= 
                CASE @IntLen 
                        WHEN 3 THEN 
                                PATINDEX('[0-9][0-9][0-9]', @TextInt) --3 digits, no commas allowed             
                         WHEN 2 THEN 
                                PATINDEX('[0-9][0-9]', @TextInt) --2 digits, no commas allowed 
                         WHEN 1 THEN 
                                PATINDEX('[0-9]', @TextInt) --1 digit, no commas allowed 
                        ELSE 
                                PATINDEX('%,[0-9][0-9][0-9]', @TextInt) 
                END 
                if @Result = 0          RETURN NULL 
        
                WHILE @IntLen >3 --while a comma is present 
                --We only allow commas at 3 digit intervals; if comma is out of place, we have an illegal number 
                        BEGIN 
                                SET @TextInt = SUBSTRING(@TextInt, 1, @LastCommaPos-1) --lop off the right side of the integer text at the position of the last comma

                                SET @IntLen = LEN(@TextInt) 
                                SET @LastCommaPos =  1 +@IntLen - CHARINDEX(',', REVERSE(@TextInt))  
                                SELECT @Result= 
                                CASE @IntLen 
                                        WHEN 3 THEN 
                                                PATINDEX('[0-9][0-9][0-9]', @TextInt) --3 digits, no commas allowed 
                                         WHEN 2 THEN 
                                                PATINDEX('[0-9][0-9]', @TextInt) --2 digits, no commas allowed 
                                         WHEN 1 THEN 
                                                PATINDEX('[0-9]', @TextInt) --1 digit, no commas allowed 
                                        ELSE 
                                                PATINDEX('%,[0-9][0-9][0-9]', @TextInt) --comma allowed just before the right-most 3 digits

                                END 
                                IF @Result = 0          RETURN NULL 
                        END 
        
                --Now strip out all the commas 
                SET @TextNum =REPLACE(@TextNum, ',','') 
        END 

-----------End of Decimal & Comma handling section ----------------------------------------------------------------------

        --At this point, the input text (@TextNum) has been: 
                --Trimmed to remove leading and trailing spaces 
                --stripped of legal starting characters 
                --checked for illegal non-numeric characters 
                --checked for duplicate decimal points 
                --checked for commas in illegal positions, and 
                --stripped of all commas 
                --checked to ensure that the maximum integer size is small enough to be converted to (18, 6) format 
        --If we got to this point, the CONVERT function should never produce errors! 
        RETURN   CONVERT(decimal(18, 6), @Neg + @TextNum)-- 

END

Rate

5 (1)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (1)

You rated this post out of 5. Change rating