do you have a StripNonNumeric ITVF function?

  • Eirikur Eiriksson (11/4/2014)


    Just dawned on me, kind of a deja vu, here is around six years old thread on the same subject.

    😎

    Heh... after 6 years, we might finally be on to something. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • If you want a major duration boost, you can always slip Adam Machanic's "make_parallel" function into the mix and then you get some serious results...

    Adam's function : http://sqlblog.com/blogs/adam_machanic/archive/2013/07/11/next-level-parallel-plan-porcing.aspx

    In use for this test harness:

    SELECT

    @string = x.DigitsOnly

    FROM dbo.make_parallel() AS mp

    CROSS APPLY

    (

    SELECT DigitsOnly

    FROM #val

    cross apply dbo.DigitsOnlyEE(txt)

    )x;

    Using this, I get parallel execution and duration drops to ~500 ms for the EE/JM function.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • mister.magoo (11/4/2014)


    If you want a major duration boost, you can always slip Adam Machanic's "make_parallel" function into the mix and then you get some serious results...

    Adam's function : http://sqlblog.com/blogs/adam_machanic/archive/2013/07/11/next-level-parallel-plan-porcing.aspx

    In use for this test harness:

    SELECT

    @string = x.DigitsOnly

    FROM dbo.make_parallel() AS mp

    CROSS APPLY

    (

    SELECT DigitsOnly

    FROM #val

    cross apply dbo.DigitsOnlyEE(txt)

    )x;

    Using this, I get parallel execution and duration drops to ~500 ms for the EE/JM function.

    +1000 Absolutely Brilliant Sir. I ran accross Adam's make_parallel() some time ago and wish I thought of this.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Eirikur Eiriksson (11/3/2014)


    Adding a little bit of tinkering based on the operator cost results in the previous post, mainly changing from the CASE to a WHERE clause for filtering. Not a big gain but some.

    😎

    USE tempdb;

    GO

    SET NOCOUNT ON;

    IF OBJECT_ID('temp1.dbo.PatEExclude8K') IS NOT NULL DROP FUNCTION dbo.PatEExclude8K;

    IF OBJECT_ID('temp1.dbo.PatExclude8K') IS NOT NULL DROP FUNCTION dbo.PatExclude8K;

    GO

    CREATE FUNCTION dbo.PatEExclude8K

    (

    @String VARCHAR(8000),

    @Pattern VARCHAR(50)

    )

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN

    WITH

    E1(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N)),

    Itally(N) AS

    (

    SELECT TOP(CONVERT(INT,LEN(@String),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM E1 T1 CROSS JOIN E1 T2 CROSS JOIN E1 T3 CROSS JOIN E1 T4

    )

    SELECT

    (

    SELECT SUBSTRING(@String,N,1)

    FROM iTally

    WHERE 0 = PATINDEX(@Pattern,SUBSTRING(@String COLLATE Latin1_General_BIN,N,1))

    FOR XML PATH('')

    ) AS NewString;

    GO

    CREATE FUNCTION dbo.PatExclude8K

    (

    @String VARCHAR(8000),

    @Pattern VARCHAR(50)

    )

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN

    WITH

    E1(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N)),

    Itally(N) AS

    (

    SELECT TOP(CONVERT(BIGINT,LEN(@String),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM E1 T1 CROSS JOIN E1 T2 CROSS JOIN E1 T3 CROSS JOIN E1 T4

    )

    SELECT

    (

    SELECT CASE WHEN PATINDEX(@Pattern,SUBSTRING(@String COLLATE Latin1_General_BIN,N,1)) = 0 THEN SUBSTRING(@String,N,1) END

    FROM iTally

    FOR XML PATH('')

    ) AS NewString;

    GO

    -- make sure the new version works

    --SELECT * FROM dbo.PatExclude8K('123acb456!','[^0-9!]')

    --SELECT * FROM dbo.PatEExclude8K('123acb456!','[^0-9!]')

    --===== Create the 100K row test table

    IF OBJECT_ID('tempdb..#val') IS NOT NULL DROP TABLE #val;

    SELECT TOP 100000

    txt = ISNULL(CONVERT(VARCHAR(36),NEWID()),'')

    INTO #val

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2;

    ALTER TABLE #Val

    ADD PRIMARY KEY CLUSTERED (txt);

    --===== Do the tests. Had to use duration because one

    -- of the tests is on the new scalar function and

    -- SET STATISTICS doesn't report on those correctly.

    GO

    PRINT '========== Using PatEExclude8K ==========';

    DECLARE @String VARCHAR(36)

    ,@StartTime DATETIME;

    SELECT @StartTime = GETDATE();

    SELECT @string = newstring

    FROM #val

    CROSS APPLY dbo.PatEExclude8K(txt, '[^0-9]');

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO 3

    PRINT '========== Using PatExclude8K ==========';

    DECLARE @String VARCHAR(36)

    ,@StartTime DATETIME;

    SELECT @StartTime = GETDATE();

    SELECT @string = newstring

    FROM #val

    CROSS APPLY dbo.PatExclude8K(txt, '[^0-9]');

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO 3

    Results

    Beginning execution loop

    ========== Using PatEExclude8K ==========

    1450

    ========== Using PatEExclude8K ==========

    1470

    ========== Using PatEExclude8K ==========

    1496

    Batch execution completed 3 times.

    Beginning execution loop

    ========== Using PatExclude8K ==========

    1586

    ========== Using PatExclude8K ==========

    1563

    ========== Using PatExclude8K ==========

    1556

    Batch execution completed 3 times.

    Well done again Eirikur. I'm going to test this out and update my script. Shortly.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Eirikur Eiriksson (11/4/2014)


    Just dawned on me, kind of a deja vu, here is around six years old thread on the same subject.

    😎

    I read through this thread - very interesting...

    Got me thinking - why not test this with a CLR.:hehe: I also ran patexclude8k() and patEExclude8k() with make_parallel().

    Notes:

    1) I am running this on my laptop (8GB ram, 2 cores, 4 logical Windows 8) so my results will be a little slower.

    2) I am using the mdq.regexreplace, the regexreplace CLR that ships with MDS/DQS. I'm sure there's faster ones, this one is just the easiest to setup.

    3) I tested the CLR against make_parallel() but excluded the results for brevity (they were bad)

    Code:

    /****** Binary code trunctated. Instructions to get the code can be found here: http://www.sqlservercentral.com/articles/MDS/75932/ ******/

    CREATE ASSEMBLY [Microsoft.MasterDataServices.DataQuality]

    FROM 0x4D5A90000300000004000000FFF...

    WITH PERMISSION_SET = SAFE

    GO

    CREATE SCHEMA [mdq] AUTHORIZATION [dbo]

    GO

    /*

    mdq.RegexReplace() modified to remove non-numeric characters

    notes:

    1. nvarchar is required;

    2. I changed the @input from nvarchar(4000) to nvarchar(200), @pattern to nvarchar(50), and @REPLACE to nvarchar(1)

    for a more "apples-to-apples" comparison

    */

    CREATE FUNCTION [mdq].[RegexReplace](@input [nvarchar](200), @pattern [nvarchar](50), @REPLACE [nvarchar](1), @mask [tinyint])

    RETURNS [nvarchar](4000) WITH EXECUTE AS CALLER, RETURNS NULL ON NULL INPUT

    AS

    EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[RegexReplace]

    GO

    PRINT '========== Using PatExclude8K ==========';

    DECLARE @String VARCHAR(36)

    ,@StartTime DATETIME;

    SELECT @StartTime = GETDATE();

    SELECT @string = newstring

    FROM #val

    CROSS APPLY dbo.PatExclude8K(txt, '[^0-9]');

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO 3

    PRINT '========== Using PatEExclude8K ==========';

    DECLARE @String VARCHAR(36)

    ,@StartTime DATETIME;

    SELECT @StartTime = GETDATE();

    SELECT @string = newstring

    FROM #val

    CROSS APPLY dbo.PatEExclude8K(txt, '[^0-9]');

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO 3

    PRINT '========== mdq.RegexReplace ==========';

    DECLARE @String VARCHAR(36)

    ,@StartTime DATETIME;

    SELECT @StartTime = GETDATE();

    SELECT @string = mdq.RegexReplace(txt, '[^0-9]', '', 0)

    FROM #val

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO 3

    -----------------------------------------------------------

    PRINT '========== Using PatExclude8K with make_parallel() ==========';

    DECLARE @String VARCHAR(36)

    ,@StartTime DATETIME;

    SELECT @StartTime = GETDATE();

    SELECT @string = newstring

    FROM #val

    CROSS APPLY dbo.PatExclude8K(txt, '[^0-9]')

    CROSS APPLY dbo.make_parallel();

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO 3

    PRINT '========== Using PatEExclude8K with make_parallel() ==========';

    DECLARE @String VARCHAR(36)

    ,@StartTime DATETIME;

    SELECT @StartTime = GETDATE();

    SELECT @string = newstring

    FROM #val

    CROSS APPLY dbo.PatEExclude8K(txt, '[^0-9]')

    CROSS APPLY dbo.make_parallel();

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO 3

    PRINT '========== mdq.RegexReplace with make_parallel() ==========';

    DECLARE @String VARCHAR(36)

    ,@StartTime DATETIME;

    SELECT @StartTime = GETDATE();

    SELECT @string = mdq.RegexReplace(txt, '[^0-9]', '', 0)

    FROM #val

    CROSS APPLY dbo.make_parallel();

    PRINT DATEDIFF(ms,@StartTime,GETDATE());

    GO 3

    Results:

    Beginning execution loop

    ========== Using PatExclude8K ==========

    2116

    ========== Using PatExclude8K ==========

    2140

    ========== Using PatExclude8K ==========

    2186

    Batch execution completed 3 times.

    Beginning execution loop

    ========== Using PatEExclude8K ==========

    2083

    ========== Using PatEExclude8K ==========

    2050

    ========== Using PatEExclude8K ==========

    2050

    Batch execution completed 3 times.

    Beginning execution loop

    ========== mdq.RegexReplace ==========

    1103

    ========== mdq.RegexReplace ==========

    1093

    ========== mdq.RegexReplace ==========

    1136

    Batch execution completed 3 times.

    Beginning execution loop

    ========== Using PatExclude8K with make_parallel() ==========

    1116

    ========== Using PatExclude8K with make_parallel() ==========

    1153

    ========== Using PatExclude8K with make_parallel() ==========

    1123

    Batch execution completed 3 times.

    Beginning execution loop

    ========== Using PatEExclude8K with make_parallel() ==========

    1180

    ========== Using PatEExclude8K with make_parallel() ==========

    1106

    ========== Using PatEExclude8K with make_parallel() ==========

    1103

    Batch execution completed 3 times.

    The CLR was looking like the clear winner until trying magoo's idea to use make_parallel() against the patexclude8K functions.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Jeff Moden (11/4/2014)


    Eirikur Eiriksson (11/4/2014)


    Just dawned on me, kind of a deja vu, here is around six years old thread on the same subject.

    😎

    Heh... after 6 years, we might finally be on to something. 😀

    On that note, one doesn't count the time it takes to make parmegiana reggiano...:-P

    😎

  • Taking this one step further... Why not just replace PatExclude8K with PatReplace8k() ?:w00t:

    I had to use logic from earlier in the thread but what about this for starters:

    CREATE FUNCTION dbo.PatReplace8K

    (

    @String VARCHAR(8000),

    @Pattern VARCHAR(50),

    @Replace VARCHAR(1)

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS

    RETURN

    WITH E1(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS E1(N)),

    Tally(N) AS (SELECT TOP (LEN(@String)) CHECKSUM(ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM E1 a,E1 b,E1 c,E1 d)

    SELECT NewString =

    CAST

    (

    (

    SELECT CASE

    WHEN PATINDEX(@Pattern,SUBSTRING(@String COLLATE Latin1_General_BIN,N,1)) = 0

    THEN SUBSTRING(@String,N,1)+''

    ELSE @replace+''

    END

    FROM Tally

    FOR XML PATH(''), TYPE

    )

    AS varchar(8000)

    );

    GO

    You still have your patexclude() functionality:

    -- replacing patexclude8k with patReplace8k()

    -- extract a phone number

    SELECT *

    FROM dbo.PatReplace8K('Call me at: 555-123-9999','[^0-9-]','')

    And you have some new functionality:

    -- a T-SQL only alternative to RegexReplace

    SELECT *

    FROM dbo.PatReplace8K('my credit card number is: 5555-1234-7777-8888','[0-9]','*')

    Edit: text and code formatting

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B (11/4/2014)


    Taking this one step further... Why not just replace PatExclude8K with PatReplace8k() ?:w00t:

    I like the idea, tinkered a little bit with the code and got the average execution time on my workbench from 4430 to 3650.

    😎

    IF OBJECT_ID('dbo.PatREEplace8K') IS NOT NULL DROP FUNCTION dbo.PatREEplace8K;

    GO

    CREATE FUNCTION dbo.PatREEplace8K

    (

    @String VARCHAR(8000),

    @Pattern VARCHAR(50),

    @Replace VARCHAR(1)

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS

    RETURN

    WITH E1(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS E1(N))

    ,Tally(N) AS (SELECT TOP (LEN(@String)) CHECKSUM(ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM E1 a,E1 b,E1 c,E1 d)

    SELECT NewString =

    (

    (

    SELECT STUFF(SUBSTRING(@String,NM.N,1),1,PATINDEX(@Pattern,SUBSTRING(@String COLLATE Latin1_General_BIN,N,1)),@Replace)

    FROM Tally NM

    FOR XML PATH('')

    )

    );

    GO

    Edit: not working

  • Eirikur Eiriksson (11/6/2014)


    Alan.B (11/4/2014)


    Taking this one step further... Why not just replace PatExclude8K with PatReplace8k() ?:w00t:

    I like the idea, tinkered a little bit with the code and got the average execution time on my workbench from 4430 to 3650.

    😎

    IF OBJECT_ID('dbo.PatREEplace8K') IS NOT NULL DROP FUNCTION dbo.PatREEplace8K;

    GO

    CREATE FUNCTION dbo.PatREEplace8K

    (

    @String VARCHAR(8000),

    @Pattern VARCHAR(50),

    @Replace VARCHAR(1)

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS

    RETURN

    WITH E1(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS E1(N))

    ,Tally(N) AS (SELECT TOP (LEN(@String)) CHECKSUM(ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM E1 a,E1 b,E1 c,E1 d)

    SELECT NewString =

    (

    (

    SELECT STUFF(SUBSTRING(@String,NM.N,1),1,PATINDEX(@Pattern,SUBSTRING(@String COLLATE Latin1_General_BIN,N,1)),@Replace)

    FROM Tally NM

    FOR XML PATH('')

    )

    );

    GO

    :ermm:

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • :blush: just noticed, had several "versions", this one is no good

    😎

  • Eirikur Eiriksson (11/6/2014)


    :blush: just noticed, had several "versions", this one is no good

    😎

    This guy here is slightly slower than the last (and fastest) version of PatExclude8K that you posted.

    ALTER FUNCTION dbo.PatReplace8K

    (

    @String VARCHAR(8000),

    @Pattern VARCHAR(50),

    @Replace CHAR(1)

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS

    RETURN

    WITH E1(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS E1(N)),

    Tally(N) AS (SELECT TOP (LEN(@String)) CHECKSUM(ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM E1 a,E1 b,E1 c,E1 d)

    SELECT NewString =

    (

    SELECT CASE

    WHEN PATINDEX(@Pattern,SUBSTRING(@String COLLATE Latin1_General_BIN,N,1)) = 0

    THEN SUBSTRING(@String,N,1)

    ELSE

    CASE @Replace WHEN '' THEN '' ELSE @Replace END-- this line seems silly but really is required (trust me!)

    END +''

    FROM Tally

    FOR XML PATH('')

    );

    GO

    So here's a question...

    Let's say the best we could do is create a PatReplace function that is 10% slower than PatExclude... Would you just use PatReplace and take the minor performance hit so you could have just one function? OR would you create both and recommend that people use PatExclude for stripping characters and patReplace for replacing characters? I am going to add this to my toolbox and was looking for a second opition...

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B (11/6/2014)


    Eirikur Eiriksson (11/6/2014)


    :blush: just noticed, had several "versions", this one is no good

    😎

    This guy here is slightly slower....

    Yes, stay away from the red one:-D

    So here's a question...

    Let's say the best we could do is create a PatReplace function that is 10% slower than PatExclude... Would you just use PatReplace and take the minor performance hit so you could have just one function? OR would you create both and recommend that people use PatExclude for stripping characters and patReplace for replacing characters? I am going to add this to my toolbox and was looking for a second opition...

    I'll go with both for certain, in fact there are at least five functions coming out of this effort, CleanNum, CleanAlpha, PatInclude, PatExclude and PatReplace.

    😎

  • Contributing a couple of observations:

    To guarantee correct concatenation order in the FOR XML PATH, an ORDER BY is required (credit: Mikael Eriksson).

    Equally, the TOP in the Tally CTE technically needs an ORDER BY as well. Without that, SQL Server would be quite within its rights to return any numbered rows from the set, in any order.

    Both correctness enhancements can be incorporated in the code without affecting the execution plan or performance.

    Taking the Digits Only function as an example, I have added suitable ORDER BY clauses below:


    CREATE OR ALTER FUNCTION dbo.DigitsOnlyEE
    (
      @pString varchar(8000)
    ) 
    RETURNS table
    WITH SCHEMABINDING 
    AS 
    RETURN
     WITH 
      E1(N) AS 
      (
       SELECT N 
       FROM 
       (
        VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)
       ) AS X(N)
      ),
      Tally(N) AS 
      (
       SELECT TOP (CONVERT(bigint, LEN(@pString))) 
            rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) 
       FROM E1 AS a
       CROSS JOIN E1 AS b
       CROSS JOIN E1 AS c
       CROSS JOIN E1 AS d 
       ORDER BY rn
      ) 
      SELECT 
       DigitsOnly = 
       (
        SELECT 
          SUBSTRING(@pString,T.N,1)
        FROM Tally AS T
        WHERE 
          ((ASCII(SUBSTRING(@pString,T.N,1)) - 48) & 0x7FFF) < 10
        ORDER BY 
          T.N
        FOR XML PATH('')
       );

  • Paul White - Tuesday, August 29, 2017 11:06 AM

    Contributing a couple of observations:

    To guarantee correct concatenation order in the FOR XML PATH, an ORDER BY is required (credit: Mikael Eriksson).

    Equally, the TOP in the Tally CTE technically needs an ORDER BY as well. Without that, SQL Server would be quite within its rights to return any numbered rows from the set, in any order.

    Both correctness enhancements can be incorporated in the code without affecting the execution plan or performance.

    Taking the Digits Only function as an example, I have added suitable ORDER BY clauses below:


    CREATE OR ALTER FUNCTION dbo.DigitsOnlyEE
    (
      @pString varchar(8000)
    ) 
    RETURNS table
    WITH SCHEMABINDING 
    AS 
    RETURN
     WITH 
      E1(N) AS 
      (
       SELECT N 
       FROM 
       (
        VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)
       ) AS X(N)
      ),
      Tally(N) AS 
      (
       SELECT TOP (CONVERT(bigint, LEN(@pString))) 
            rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) 
       FROM E1 AS a
       CROSS JOIN E1 AS b
       CROSS JOIN E1 AS c
       CROSS JOIN E1 AS d 
       ORDER BY rn
      ) 
      SELECT 
       DigitsOnly = 
       (
        SELECT 
          SUBSTRING(@pString,T.N,1)
        FROM Tally AS T
        WHERE 
          ((ASCII(SUBSTRING(@pString,T.N,1)) - 48) & 0x7FFF) < 10
        ORDER BY 
          T.N
        FOR XML PATH('')
       );

    Good points Paul, thanks for the heads up!
    😎

  • Eirikur Eiriksson - Tuesday, August 29, 2017 12:59 PM

    Paul White - Tuesday, August 29, 2017 11:06 AM

    Contributing a couple of observations:

    To guarantee correct concatenation order in the FOR XML PATH, an ORDER BY is required (credit: Mikael Eriksson).

    Equally, the TOP in the Tally CTE technically needs an ORDER BY as well. Without that, SQL Server would be quite within its rights to return any numbered rows from the set, in any order.

    Both correctness enhancements can be incorporated in the code without affecting the execution plan or performance.

    Taking the Digits Only function as an example, I have added suitable ORDER BY clauses below:


    CREATE OR ALTER FUNCTION dbo.DigitsOnlyEE
    (
      @pString varchar(8000)
    ) 
    RETURNS table
    WITH SCHEMABINDING 
    AS 
    RETURN
     WITH 
      E1(N) AS 
      (
       SELECT N 
       FROM 
       (
        VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)
       ) AS X(N)
      ),
      Tally(N) AS 
      (
       SELECT TOP (CONVERT(bigint, LEN(@pString))) 
            rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) 
       FROM E1 AS a
       CROSS JOIN E1 AS b
       CROSS JOIN E1 AS c
       CROSS JOIN E1 AS d 
       ORDER BY rn
      ) 
      SELECT 
       DigitsOnly = 
       (
        SELECT 
          SUBSTRING(@pString,T.N,1)
        FROM Tally AS T
        WHERE 
          ((ASCII(SUBSTRING(@pString,T.N,1)) - 48) & 0x7FFF) < 10
        ORDER BY 
          T.N
        FOR XML PATH('')
       );

    Good points Paul, thanks for the heads up!
    😎

    I'm pretty sure that's not necessary in this case because of the original ORDER BY (strange as it seems).  If possible, I'd like to see someone come up with a case where the original ORDER BY can have the screws put to it externally even if intentionally.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Viewing 15 posts - 31 through 45 (of 46 total)

    You must be logged in to reply to this topic. Login to reply