Tally OH! An Improved SQL 8K “CSV Splitter” Function

  • Ed Wagner (6/22/2016)


    David Moutray (6/22/2016)


    Classic article, Jeff. I have loved it (and used it extensively) since it first came out five years ago (and the discussion is still going strong!)

    With SQL 2016 just released this month, we now have the new STRING_SPLIT() function. How does that stack up against the various string splitters you have tested?

    Thank you, again, for your many contributions to our community. 🙂

    Wayne Sheffield wrote an article on just that topic: http://www.sqlservercentral.com/articles/STRING_SPLIT/139338/.

    Note that the new function doesn't return (or even guarantee) ordinal position. Without it, the new function is s a bit limited.

    Rumor has it that they're changing the function to return the ordinal position of the split-out elements. Not sure when, though.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • carl.thompson (11/22/2016)


    Hi,

    Thanks for the article Jeff.

    Would anyone be able to explain the relevance and importance of using 'WITH SCHEMA BINDING' with this function please?

    Thanks,

    Thanks for the feedback, Carl. Looks like a lot of good folks jumped in on the WITH SCHEMA_BINDING thing. You all set on that?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    Yes thanks. I had somewhat of an understanding of Schema Binding although I hadn't used it much myself. That's why it confused me a little within the split function. However, it has been explained that it is there to eek out that little bit extra performance.

    Thanks for the reply.

    Carl

  • Jeff Moden (11/22/2016)


    Ed Wagner (11/22/2016)


    carl.thompson (11/22/2016)


    Hi SSCrazyEights,

    Thank you for your reply.

    That makes perfect sense and I think confirmed what we thought to be the reason for the inclusion in the function. Can you confirm that this has been used in the function for the purpose of speed alone? As the function does not appear to be referencing any schema bound objects?

    Our infrastructure team have advised that a system upgrade failed due to the split function and a view both using WITH SCHEMA BINDING. Other for than performance improvement we advised that the SCHEMA BINDING option was not required within the split function. Have we provided correct advice?

    Thanks in advance for any reply.

    Carl

    The schemabinding is always optional. Specifying it means that the query optimizer knows that it doesn't have to check anything, so it saves a step at run-time. This is where the performance improvement comes from.

    The other advantage it can provide is safety. Let's say you have a function that references a table and relies on that table to do what it needs to do. Someone comes along and, not knowing about your function, alters the table in a way that breaks your function. With schemabinding specified on the function, the table can't be altered. The function has to be altered, then the table altered, then the function altered again. In short, it saves you from yourself...and others.

    I must admit I'm curious why your infrastructure team said that a system upgrade failed due to a schemabinding option being present on a view and function. If you're talking about an upgrade to an application, I can see it if the upgrade altered a base object, but I would think this would have been discovered during development or testing. I can't see it for an upgrade of the system - either OS or database version.

    Ah... you said "always". 😉 Just for those that might run into it, it's not optional if you're using it as a source for a persisted computed column.

    Touche, sir. 😉

  • Hi Jeff

    Thanks to you and everyone that has contributed to this article. It is very helpful!

    A very, very minor nitpick in your comments please:
    /*
    --10E+1 or 10 rows
    --10E+2 or 100 rows
    --10E+4 or 10,000 rows max
    */

    Those numbers should be:
    /*
    --10E+0 or 10 rows
    --10E+1 or 100 rows
    --10E+3 or 10,000 rows max
    */

    E+0:   https://www.google.co.za/search?q=%3D10E%2B0
    E+1:   https://www.google.co.za/search?q=%3D10E%2B1
    E+3:   https://www.google.co.za/search?q=%3D10E%2B3

  • Ah... you're absolutely correct.  I should have listed it as just En or the more formal 1En or the spreadsheet notation of 10^n.  Thank you.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I think it really depends on what you read.  I have also seen posts where 10e1 = 10.  Same search page that the links above point to.

  • Lynn Pettis - Tuesday, March 14, 2017 11:48 AM

    I think it really depends on what you read.  I have also seen posts where 10e1 = 10.  Same search page that the links above point to.

    We should ask SQL Server what it thinks.

    select 10E1 [10E1], 10E2 [10E2], 10E3 [10E3]

    10E1 10E2 10E3
    100 1000 10000

  • m.t.cleary - Tuesday, March 14, 2017 9:21 PM

    Lynn Pettis - Tuesday, March 14, 2017 11:48 AM

    I think it really depends on what you read.  I have also seen posts where 10e1 = 10.  Same search page that the links above point to.

    We should ask SQL Server what it thinks.

    select 10E1 [10E1], 10E2 [10E2], 10E3 [10E3]

    10E1 10E2 10E3
    100 1000 10000

    Quick note,
    "e" = "10^" , that is 10e1 != 10^1 but rather 10 x 10^1.
    😎

    10E1 = 10 x 10^1 = 100
    10E2 = 10 x 10^2 = 1000
    10E3 = 10 x 10^3 = 10000

  • Hello:

         Is anyone getting the below error when running the tests; I'll have to admit my SQL is not the strongest out there and I do not fully understand your code; but I need something to break apart a CSV string SSRS is passing to my Stored Procedure when selecting multiple parameters from a multi-parameter select list. 
         I am running MS SQL 2008 R2.
         When I click on the below error it brings me to the last line in your code:
                              CROSS APPLY dbo.DelimitedSplit8K(test.SomeValue,',') split

    Error Message

    Msg 102, Level 15, State 1, Line 38

    Incorrect syntax near '.'.

    Thank You,
    Rich

  • rinzana - Wednesday, September 6, 2017 3:20 PM

    Hello:

         Is anyone getting the below error when running the tests; I'll have to admit my SQL is not the strongest out there and I do not fully understand your code; but I need something to break apart a CSV string SSRS is passing to my Stored Procedure when selecting multiple parameters from a multi-parameter select list. 
         I am running MS SQL 2008 R2.
         When I click on the below error it brings me to the last line in your code:
                              CROSS APPLY dbo.DelimitedSplit8K(test.SomeValue,',') split

    Error Message

    Msg 102, Level 15, State 1, Line 38

    Incorrect syntax near '.'.

    Thank You,
    Rich

    Hi Rich.  First of all, welcome to SSC.

    Do you have the table test.SomeValue and is it visible to the SQL statement as it's running?  I don't know SSRS, but if it's passing a value as a parameter to your stored procedure, I'd expect to see something like this, where @ValueList is the parameter being passed.

    CROSS APPLY dbo.DelimitedSplit8K(@ValueList, ',') split

    If this isn't the problem, could you please post the entire SQL statement that's throwing the error?

  • Ed Wagner - Wednesday, September 6, 2017 3:28 PM

    rinzana - Wednesday, September 6, 2017 3:20 PM

    Hello:

         Is anyone getting the below error when running the tests; I'll have to admit my SQL is not the strongest out there and I do not fully understand your code; but I need something to break apart a CSV string SSRS is passing to my Stored Procedure when selecting multiple parameters from a multi-parameter select list. 
         I am running MS SQL 2008 R2.
         When I click on the below error it brings me to the last line in your code:
                              CROSS APPLY dbo.DelimitedSplit8K(test.SomeValue,',') split

    Error Message

    Msg 102, Level 15, State 1, Line 38

    Incorrect syntax near '.'.

    Thank You,
    Rich

    Hi Rich.  First of all, welcome to SSC.

    Do you have the table test.SomeValue and is it visible to the SQL statement as it's running?  I don't know SSRS, but if it's passing a value as a parameter to your stored procedure, I'd expect to see something like this, where @ValueList is the parameter being passed.

    CROSS APPLY dbo.DelimitedSplit8K@ValueList, ',') split

    If this isn't the problem, could you please post the entire SQL statement that's throwing the error?

    Ed You forgot the parenthesis in your example which made me double take a couple times. I think you mean something like this.

    CROSS APPLY dbo.DelimitedSplit8K(@ValueList, ',') split

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange - Wednesday, September 6, 2017 3:58 PM

    Ed Wagner - Wednesday, September 6, 2017 3:28 PM

    rinzana - Wednesday, September 6, 2017 3:20 PM

    Hello:

         Is anyone getting the below error when running the tests; I'll have to admit my SQL is not the strongest out there and I do not fully understand your code; but I need something to break apart a CSV string SSRS is passing to my Stored Procedure when selecting multiple parameters from a multi-parameter select list. 
         I am running MS SQL 2008 R2.
         When I click on the below error it brings me to the last line in your code:
                              CROSS APPLY dbo.DelimitedSplit8K(test.SomeValue,',') split

    Error Message

    Msg 102, Level 15, State 1, Line 38

    Incorrect syntax near '.'.

    Thank You,
    Rich

    Hi Rich.  First of all, welcome to SSC.

    Do you have the table test.SomeValue and is it visible to the SQL statement as it's running?  I don't know SSRS, but if it's passing a value as a parameter to your stored procedure, I'd expect to see something like this, where @ValueList is the parameter being passed.

    CROSS APPLY dbo.DelimitedSplit8K@ValueList, ',') split

    If this isn't the problem, could you please post the entire SQL statement that's throwing the error?

    Ed You forgot the parenthesis in your example which made me double take a couple times. I think you mean something like this.

    CROSS APPLY dbo.DelimitedSplit8K(@ValueList, ',') split

    DOH! Well, that was a stupid mistake.  Thanks for catching it for me.  I'll fix my post.

  • Thank you for you quick replies:
    I am using the Test 1 script provided 

    -- TEST 1:
    -- This tests for various possible conditions in a string using a comma as the delimiter. The expected results are
    -- laid out in the comments
    --=====================================================================================================================
    --===== Conditionally drop the test tables to make reruns easier for testing.
      -- (this is NOT a part of the solution)
      IF OBJECT_ID('tempdb..#JBMTest') IS NOT NULL DROP TABLE #JBMTest
    ;
    --===== Create and populate a test table on the fly (this is NOT a part of the solution).
      -- In the following comments, "b" is a blank and "E" is an element in the left to right order.
      -- Double Quotes are used to encapsulate the output of "Item" so that you can see that all blanks
      -- are preserved no matter where they may appear.
    SELECT *
     INTO #JBMTest
     FROM (                --# & type of Return Row(s)
       SELECT 0, NULL        UNION ALL --1 NULL
       SELECT 1, SPACE(0)      UNION ALL --1 b (Empty String)
       SELECT 2, SPACE(1)      UNION ALL --1 b (1 space)
       SELECT 3, SPACE(5)      UNION ALL --1 b (5 spaces)
       SELECT 4, ','        UNION ALL --2 b b (both are empty strings)
       SELECT 5, '55555'       UNION ALL --1 E
       SELECT 6, ',55555'      UNION ALL --2 b E
       SELECT 7, ',55555,'      UNION ALL --3 b E b
       SELECT 8, '55555,'      UNION ALL --2 b B
       SELECT 9, '55555,1'      UNION ALL --2 E E
       SELECT 10, '1,55555'      UNION ALL --2 E E
       SELECT 11, '55555,4444,333,22,1'  UNION ALL --5 E E E E E
       SELECT 12, '55555,4444,,333,22,1'  UNION ALL --6 E E b E E E
       SELECT 13, ',55555,4444,,333,22,1,' UNION ALL --8 b E E b E E E b
       SELECT 14, ',55555,4444,,,333,22,1,' UNION ALL --9 b E E b b E E E b
       SELECT 15, ' 4444,55555 '    UNION ALL --2 E (w/Leading Space) E (w/Trailing Space)
       SELECT 16, 'This,is,a,test.'       --E E E E
       ) d (SomeID, SomeValue)
    ;
    --===== Split the CSV column for the whole table using CROSS APPLY (this is the solution)
    SELECT test.SomeID, test.SomeValue, split.ItemNumber, Item = QUOTENAME(split.Item,'"')
     FROM #JBMTest test
    CROSS APPLY dbo.DelimitedSplit8K (test.SomeValue, ',') split
    ;

    Rich

  • Thank you for you quick replies:
    I am using the Test 1 script provided 

    -- TEST 1:
    -- This tests for various possible conditions in a string using a comma as the delimiter. The expected results are
    -- laid out in the comments
    --=====================================================================================================================
    --===== Conditionally drop the test tables to make reruns easier for testing.
      -- (this is NOT a part of the solution)
      IF OBJECT_ID('tempdb..#JBMTest') IS NOT NULL DROP TABLE #JBMTest
    ;
    --===== Create and populate a test table on the fly (this is NOT a part of the solution).
      -- In the following comments, "b" is a blank and "E" is an element in the left to right order.
      -- Double Quotes are used to encapsulate the output of "Item" so that you can see that all blanks
      -- are preserved no matter where they may appear.
    SELECT *
     INTO #JBMTest
     FROM (                --# & type of Return Row(s)
       SELECT 0, NULL        UNION ALL --1 NULL
       SELECT 1, SPACE(0)      UNION ALL --1 b (Empty String)
       SELECT 2, SPACE(1)      UNION ALL --1 b (1 space)
       SELECT 3, SPACE(5)      UNION ALL --1 b (5 spaces)
       SELECT 4, ','        UNION ALL --2 b b (both are empty strings)
       SELECT 5, '55555'       UNION ALL --1 E
       SELECT 6, ',55555'      UNION ALL --2 b E
       SELECT 7, ',55555,'      UNION ALL --3 b E b
       SELECT 8, '55555,'      UNION ALL --2 b B
       SELECT 9, '55555,1'      UNION ALL --2 E E
       SELECT 10, '1,55555'      UNION ALL --2 E E
       SELECT 11, '55555,4444,333,22,1'  UNION ALL --5 E E E E E
       SELECT 12, '55555,4444,,333,22,1'  UNION ALL --6 E E b E E E
       SELECT 13, ',55555,4444,,333,22,1,' UNION ALL --8 b E E b E E E b
       SELECT 14, ',55555,4444,,,333,22,1,' UNION ALL --9 b E E b b E E E b
       SELECT 15, ' 4444,55555 '    UNION ALL --2 E (w/Leading Space) E (w/Trailing Space)
       SELECT 16, 'This,is,a,test.'       --E E E E
       ) d (SomeID, SomeValue)
    ;
    --===== Split the CSV column for the whole table using CROSS APPLY (this is the solution)
    SELECT test.SomeID, test.SomeValue, split.ItemNumber, Item = QUOTENAME(split.Item,'"')
     FROM #JBMTest test
    CROSS APPLY dbo.DelimitedSplit8K (test.SomeValue, ',') split
    ;

    Rich

Viewing 15 posts - 916 through 930 (of 990 total)

You must be logged in to reply to this topic. Login to reply