Extracting Numbers from String

  • Do we have any in-built sql function that gives us numeric values in a string?

    I have to deal with some inconsistent US phone numbers stored in DB. They are stored as

    (xxx)xxx-xxxx

    xxx xxxx xxxx

    (xxx) xxx-xxxx

    xxx-xxx-xxxx

    xxxxxxxxxx

    I don't want to apply nested REPLACE function to eliminate all unnecessary characters to get 10 digit number from DB.

    Thanks for your inputs.

  • I know of any simple way to do what you're asking without using REPLACE.

    FYI, using REPLACE with the format you gave looks pretty simple:

    REPLACE(REPLACE(REPLACE(REPLACE(PhoneNumber, '(', ''), ')', ''), '-', ''), ' ', '')



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Alvin Ramard (6/2/2015)


    I know of any simple way to do what you're asking without using REPLACE.

    FYI, using REPLACE with the format you gave looks pretty simple:

    REPLACE(REPLACE(REPLACE(REPLACE(PhoneNumber, '(', ''), ')', ''), '-', ''), ' ', '')

    Thanks Alvin. I had implemented this nested replace already but thought of checking with people here for an efficient solution, if exists. Would it be somehow possible to use Regular Expression replace in SQL to make it work? The reason to look for alternative is I'm only worried if there is any other format hidden in the data which I have not seen. Using REPLACE makes me force to change my logic every time I find a new format. I agree with you on the point that it is quite straight-forward and simple!!

  • take a look at this thread from a while ago: it's got some excellent examples on stripping out non-numeric characters in the fastest way possible.

    do you have a StripNonNumeric ITVF function?

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • A scalar function will handle that just fine. Particularly if you do it only once, when the data is inserted. You can add a format code / pattern if you really need the data to come out in exactly the same format in which it came in.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (6/2/2015)


    A scalar function will handle that just fine. Particularly if you do it only once, when the data is inserted. You can add a format code / pattern if you really need the data to come out in exactly the same format in which it came in.

    We're not talking about formatting data as it is entered. We're talking about dealing with data that already has been entered in various formats.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Regex sounds like a great idea, but I'm not aware of any way to use it without creating a CLR function.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Alvin Ramard (6/2/2015)


    ScottPletcher (6/2/2015)


    A scalar function will handle that just fine. Particularly if you do it only once, when the data is inserted. You can add a format code / pattern if you really need the data to come out in exactly the same format in which it came in.

    We're not talking about formatting data as it is entered. We're talking about dealing with data that already has been entered in various formats.

    Not necessarily. We could just fully solve the problem by removing them once at entry instead of having to do it every time the data is processed. Stop letting garbage in to begin with instead of having to continue to clean it up over and over and over later. Thousands (or more?) of nonnumeric char parsing instead of just once at entry.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (6/2/2015)


    Alvin Ramard (6/2/2015)


    ScottPletcher (6/2/2015)


    A scalar function will handle that just fine. Particularly if you do it only once, when the data is inserted. You can add a format code / pattern if you really need the data to come out in exactly the same format in which it came in.

    We're not talking about formatting data as it is entered. We're talking about dealing with data that already has been entered in various formats.

    Not necessarily. We could just fully solve the problem by removing them once at entry instead of having to do it every time the data is processed. Stop letting garbage in to begin with instead of having to continue to clean it up over and over and over later. Thousands (or more?) of nonnumeric char parsing instead of just once at entry.

    I don't disagree with you about dealing with data being entered, but it is of no help in dealing with what is already entered.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Lowell (6/2/2015)


    take a look at this thread from a while ago: it's got some excellent examples on stripping out non-numeric characters in the fastest way possible.

    do you have a StripNonNumeric ITVF function?

    I went looking for this but you beat me to it, haven't found anything better than the code from that thread

    😎

  • Well, Yes, I agree!! I am working on data migration where I saw this pattern. I'm fixing this to have clean data in my new DB. I would like to leave it to front-end to display the format they like. In my case, it is just once off to read from existing DB, make it clean and store in my DB.

  • For now you could create a non-persisted computed column you could reference in queries to get the "clean" number. Later you could make it an actual column by doing the initial setting of it for existing rows and allowing a trigger to set the value in the future.

    First, create the function below, and then add the column to the table like below:

    ALTER TABLE table_name

    ADD phone_number_clean AS dbo.StripNonnumericChars(phone_number);

    SET ANSI_NULLS ON

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE FUNCTION [dbo].[StripNonnumericChars] (

    @string varchar(200)

    )

    RETURNS varchar(200)

    WITH RETURNS NULL ON NULL INPUT

    AS

    BEGIN

    DECLARE @byte int

    WHILE 1 = 1

    BEGIN

    SET @byte = PATINDEX('%[^0-9]%', @string)

    IF @byte = 0

    BREAK

    SET @string = STUFF(@string, @byte, 1, '')

    END --WHIILE

    RETURN @string

    END --FUNCTION

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Eirikur Eiriksson (6/2/2015)


    Lowell (6/2/2015)


    take a look at this thread from a while ago: it's got some excellent examples on stripping out non-numeric characters in the fastest way possible.

    do you have a StripNonNumeric ITVF function?

    I went looking for this but you beat me to it, haven't found anything better than the code from that thread

    😎

    i only started the ball rolling, you and several others took the performance to new heights on that one. i've got a number of scripts harvested form that awesome thread.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (6/2/2015)


    Eirikur Eiriksson (6/2/2015)


    Lowell (6/2/2015)


    take a look at this thread from a while ago: it's got some excellent examples on stripping out non-numeric characters in the fastest way possible.

    do you have a StripNonNumeric ITVF function?

    I went looking for this but you beat me to it, haven't found anything better than the code from that thread

    😎

    i only started the ball rolling, you and several others took the performance to new heights on that one. i've got a number of scripts harvested form that awesome thread.

    I guess it would be reasonably easy to modify the script to remove all non-printable characters by changing the range of ASCII code.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Alvin Ramard (6/2/2015)


    Lowell (6/2/2015)


    Eirikur Eiriksson (6/2/2015)


    Lowell (6/2/2015)


    take a look at this thread from a while ago: it's got some excellent examples on stripping out non-numeric characters in the fastest way possible.

    do you have a StripNonNumeric ITVF function?

    I went looking for this but you beat me to it, haven't found anything better than the code from that thread

    😎

    i only started the ball rolling, you and several others took the performance to new heights on that one. i've got a number of scripts harvested form that awesome thread.

    I guess it would be reasonably easy to modify the script to remove all non-printable characters by changing the range of ASCII code.

    IIRC at least some of the code in that thread does that, otherwise you are right, it should be easy.

    😎

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply