How to escape the ^ symbol in the following function

  • Good afternoon.

    I have the following function:

    IF EXISTS (SELECT * FROM sys.objects

    WHERE object_id = OBJECT_ID(N'[dbo].[fn_QBRemoveNonAlphaChars]')

    AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))

    DROP FUNCTION [dbo].[fn_QBRemoveNonAlphaChars]

    GO

    CREATE FUNCTION [dbo].[fn_QBRemoveNonAlphaChars](@String VARCHAR(1000))

    RETURNS VARCHAR(1000)

    AS

    BEGIN

    WHILE PATINDEX('%[^a-z ''^-]%', @String) > 0

    SET @String = STUFF(@String, PATINDEX('%[^a-z ''^-]%', @String), 1, '')

    RETURN @String

    END

    GO

    This works great most of the time, for example:

    select dbo.[fn_QBRemoveNonAlphaChars]('The~@ CA:$%&*$()T - SA0099878778747T ~~~;;;!||o*&")|n the ma@@@@@t')

    returns: The CAT - SAT on the mat which is perfect

    However, if I try it with:

    select dbo.[fn_QBRemoveNonAlphaChars]('The^~@ CA:$%&*$()T - SA0099878778747T ~~~;;;!||o*&")|n the ma@@@@@t')

    It returns: The^ CAT - SAT on the mat

    I've tried to escape it but I can't get it to work.

    Any suggestions?

    Thanks.

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • The fast way of working around it will be to modify the function's last line, so insead of the the line

    RETURN @String

    it would be

    RETURN REPLACE(@String, '^','')

    Adi

    --------------------------------------------------------------
    To know how to ask questions and increase the chances of getting asnwers:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Total guess, no testing has gone into this: -

    CREATE FUNCTION [dbo].[fn_QBRemoveNonAlphaChars] (@String VARCHAR(1000))

    RETURNS VARCHAR(1000)

    AS

    BEGIN

    WHILE PATINDEX('%[^a-z ''-]%', @String) > 0

    SET @String = STUFF(@String, PATINDEX('%[^a-z ''-]%', @String), 1, '')

    RETURN @String

    END

    You had the ^ character in your "allowed" pattern.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Cadavre, the ^ at the end is there to exclude the - from being removed.

    Adi, I'm happy with your solution!

    Thanks to you both.

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Abu Dina (8/15/2012)


    Cadavre, the ^ at the end is there to exclude the - from being removed.

    Adi, I'm happy with your solution!

    Thanks to you both.

    Test mine 😉

    Your code: -

    IF EXISTS (

    SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'[dbo].[fn_QBRemoveNonAlphaChars]')

    AND type IN (N'FN', N'IF', N'TF', N'FS', N'FT')

    )

    DROP FUNCTION [dbo].[fn_QBRemoveNonAlphaChars]

    GO

    CREATE FUNCTION [dbo].[fn_QBRemoveNonAlphaChars] (@String VARCHAR(1000))

    RETURNS VARCHAR(1000)

    AS

    BEGIN

    WHILE PATINDEX('%[^a-z ''^-]%', @String) > 0

    SET @String = STUFF(@String, PATINDEX('%[^a-z ''^-]%', @String), 1, '')

    RETURN @String

    END

    GO

    SELECT dbo.[fn_QBRemoveNonAlphaChars](

    'The^~@ CA:$%&*$()T - SA0099878778747T ~~~;;;!||o*&")|n the ma@@@@@t')

    Returns: -

    ----------------------------

    The^ CAT - SAT on the mat

    My change: -

    IF EXISTS (

    SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'[dbo].[fn_QBRemoveNonAlphaChars]')

    AND type IN (N'FN', N'IF', N'TF', N'FS', N'FT')

    )

    DROP FUNCTION [dbo].[fn_QBRemoveNonAlphaChars]

    GO

    CREATE FUNCTION [dbo].[fn_QBRemoveNonAlphaChars] (@String VARCHAR(1000))

    RETURNS VARCHAR(1000)

    AS

    BEGIN

    WHILE PATINDEX('%[^a-z ''-]%', @String) > 0

    SET @String = STUFF(@String, PATINDEX('%[^a-z ''-]%', @String), 1, '')

    RETURN @String

    END

    GO

    SELECT dbo.[fn_QBRemoveNonAlphaChars](

    'The^~@ CA:$%&*$()T - SA0099878778747T ~~~;;;!||o*&")|n the ma@@@@@t')

    Returns: -

    ---------------------------

    The CAT - SAT on the mat


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Cadavre, my sincerest apologies! :blush:

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Abu Dina (8/15/2012)


    Cadavre, my sincerest apologies! :blush:

    No problem. We're all here to learn and it's better to make a mistake on a forum than on production code, right? 😀


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Hear hear

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • would you expect your function to remove high ascii characters as well?

    i had posted a solution in your other thread on a similar strip function;

    what would you expect for the results of this?

    SELECT dbo.[fn_QBRemoveNonAlphaChars](

    'ÀAlbèert ËEîinstêeiìn ÌInstìitúutëe - MPG')

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Hi Lowell,

    Just got home to see your reply.

    I don't have access to my work right now but your method is just easier to understand and adapt.

    Will try again tomorrow and report back.

    Thanks for your time, much appreciated.

    Regards.

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Abu Dina (8/15/2012)


    Hi Lowell,

    Just got home to see your reply.

    I don't have access to my work right now but your method is just easier to understand and adapt.

    Will try again tomorrow and report back.

    Thanks for your time, much appreciated.

    Regards.

    I'd like to suggest replacing your While loop driven scalar UDF with a cCTE (Cascading CTE) driven iSF (Inline Scalar Function) for the sake of performance. Here's the function I suggest... you can, of course, change the name if you decide to adopt it.

    CREATE FUNCTION dbo.isfRemoveNonAlphaChars

    (@pString VARCHAR(8000))

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN

    --===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...

    -- enough to cover VARCHAR(8000)

    WITH E1(N) AS (

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ), --10E+1 or 10 rows

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max

    cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front

    -- for both a performance gain and prevention of accidental "overruns"

    SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4

    )

    SELECT CleanString =

    (

    SELECT SUBSTRING(@pString,t.N,1)

    FROM cteTally t --No WHERE clause needed because of TOP above

    WHERE SUBSTRING(@pString,t.N,1) LIKE '[a-z ''-]'

    ORDER BY t.N

    FOR XML PATH(''), TYPE

    ).value('text()[1]', 'varchar(8000)')

    ;

    Now... let's do 2 slightly different tests. This first test "cleans" just 36 characters across just 10,000 rows.

    --===== Conditionallly drop the test table to make reruns in SSMS easier.

    IF OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL DROP TABLE #TestTable;

    --===== Create and populate the test table on-the-fly.

    SELECT TOP 10000

    SomeMixedString = REPLICATE(CAST(NEWID() AS VARCHAR(50)),1)

    INTO #TestTable

    FROM sys.All_Columns ac1

    CROSS JOIN sys.All_Columns ac2

    ;

    GO

    -------------------------------------------------------------

    --===== Declare a variable to take display and disk times out of the picture

    -- and another to measure duration with. You cannot use SET STATISTICS

    -- on Scalar UDF's to measure performance because it will artificially

    -- increase the duration by many times.

    DBCC FREEPROCCACHE;

    DECLARE @Bitbucket VARCHAR(50),

    @StartTime DATETIME;

    RAISERROR('========== Original Function ==========',0,1)

    SELECT @StartTime = GETDATE();

    SELECT @Bitbucket = dbo.fn_QBRemoveNonAlphaChars(SomeMixedString)

    FROM #TestTable;

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO

    --===== Declare a variable to take display and disk times out of the picture

    -- and another to measure duration with. You cannot use SET STATISTICS

    -- on Scalar UDF's to measure performance because it will artificially

    -- increase the duration by many times.

    DBCC FREEPROCCACHE;

    DECLARE @Bitbucket VARCHAR(50),

    @StartTime DATETIME;

    RAISERROR('========== cCTE Function ==========',0,1)

    SELECT @StartTime = GETDATE();

    SELECT @Bitbucket = (SELECT CleanString FROM dbo.isfRemoveNonAlphaChars(SomeMixedString))

    FROM #TestTable

    --CROSS APPLY dbo.isfRemoveNonAlphaChars(SomeMixedString) c;

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO

    Here's what I get on my 10 year old, single 1.8GHz desktop box for durations...

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    ========== Original Function ==========

    1383

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    ========== cCTE Function ==========

    1180

    Now, let's try an identical test but with 4 times the width or just 144 characters wide.

    --===== Conditionallly drop the test table to make reruns in SSMS easier.

    IF OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL DROP TABLE #TestTable;

    --===== Create and populate the test table on-the-fly.

    SELECT TOP 10000

    SomeMixedString = REPLICATE(CAST(NEWID() AS VARCHAR(50)),4)

    INTO #TestTable

    FROM sys.All_Columns ac1

    CROSS JOIN sys.All_Columns ac2

    ;

    GO

    -------------------------------------------------------------

    --===== Declare a variable to take display and disk times out of the picture

    -- and another to measure duration with. You cannot use SET STATISTICS

    -- on Scalar UDF's to measure performance because it will artificially

    -- increase the duration by many times.

    DBCC FREEPROCCACHE;

    DECLARE @Bitbucket VARCHAR(50),

    @StartTime DATETIME;

    RAISERROR('========== Original Function ==========',0,1)

    SELECT @StartTime = GETDATE();

    SELECT @Bitbucket = dbo.fn_QBRemoveNonAlphaChars(SomeMixedString)

    FROM #TestTable;

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO

    --===== Declare a variable to take display and disk times out of the picture

    -- and another to measure duration with. You cannot use SET STATISTICS

    -- on Scalar UDF's to measure performance because it will artificially

    -- increase the duration by many times.

    DBCC FREEPROCCACHE;

    DECLARE @Bitbucket VARCHAR(50),

    @StartTime DATETIME;

    RAISERROR('========== cCTE Function ==========',0,1)

    SELECT @StartTime = GETDATE();

    SELECT @Bitbucket = (SELECT CleanString FROM dbo.isfRemoveNonAlphaChars(SomeMixedString))

    FROM #TestTable

    --CROSS APPLY dbo.isfRemoveNonAlphaChars(SomeMixedString) c;

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO

    Here are the duration results for that run...

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    ========== Original Function ==========

    11366

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    ========== cCTE Function ==========

    3236

    The Scalar UDF with the While loop in it increased duration by about 7.2 times even though it was only given 3 times the amount of work.

    The cCTE Function only increased duration by about 1.7 times even though it was given 3 times the amount of work.

    If we run the same code again but 8 times as wide (288 characters) as the original, here are the run results ...

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    ========== Original Function ==========

    40303

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    ========== cCTE Function ==========

    5830

    If you've never worked with a "Tally CTE" (made up of Cascading CTEs) or don't know how it works to replace certain while loops, please see the following article.

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff, I'm in awe!

    Respect my good man!

    So we're rebuilding the string one character at a time and keeping only characters (LIKE '[a-z ''-]'). Maybe I'm just stupid but how do we go about replacing other characters with a blank space instead

    SELECT CleanString =

    (

    SELECT SUBSTRING(@pString,t.N,1)

    FROM cteTally t

    WHERE SUBSTRING(@pString,t.N,1) LIKE '[a-z ''-]'

    ORDER BY t.N

    FOR XML PATH(''), TYPE

    ).value('text()[1]', 'varchar(8000)')

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Abu Dina (8/16/2012)


    Jeff, I'm in awe!

    Respect my good man!

    So we're rebuilding the string one character at a time and keeping only characters (LIKE '[a-z ''-]'). Maybe I'm just stupid but how do we go about replacing other characters with a blank space instead

    SELECT CleanString =

    (

    SELECT SUBSTRING(@pString,t.N,1)

    FROM cteTally t

    WHERE SUBSTRING(@pString,t.N,1) LIKE '[a-z ''-]'

    ORDER BY t.N

    FOR XML PATH(''), TYPE

    ).value('text()[1]', 'varchar(8000)')

    This maybe?

    CREATE FUNCTION dbo.isfRemoveNonAlphaChars

    (@pString VARCHAR(8000))

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN

    --===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...

    -- enough to cover VARCHAR(8000)

    WITH E1(N) AS (

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ), --10E+1 or 10 rows

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max

    cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front

    -- for both a performance gain and prevention of accidental "overruns"

    SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4

    )

    SELECT CleanString =

    (

    SELECT CASE WHEN SUBSTRING(@pString,t.N,1) LIKE '[a-z ''-]' THEN SUBSTRING(@pString,t.N,1) ELSE ' ' END

    FROM cteTally t --No WHERE clause needed because of TOP above

    ORDER BY t.N

    FOR XML PATH(''), TYPE

    ).value('text()[1]', 'varchar(8000)')

    ;


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Sorry

    select cleanstring from dbo.isfn_QBRemoveNonAlphaChars('Abu???////Dina')

    Gives back AbuDina so no spaces insrted. 🙁

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Abu Dina (8/16/2012)


    Sorry

    select cleanstring from dbo.isfn_QBRemoveNonAlphaChars('Abu???////Dina')

    Gives back AbuDina so no spaces insrted. 🙁

    We've had a couple of different names for this function. There may be some confusion because the modification that Cadavre made works just fine. Make sure that you're executing the correct function.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply