STRING_SPLIT with Index as Table-Valued Function

  • This is a pretty common "fix" for this problem but, Microsoft makes no guarantees that such a thing will actually work forever.  Knowing how most managed code would work to do such a thing would indicate that it should work all the time, but I wouldn't be my life on it.  My recommendation would be to use something that is guaranteed to return the correct elemental order index, even if it's a bit slower or more limited in what it can be fed.

    Shifting gears a bit, I don't care too much for the function for another reason... it won't return a NULL if given a NULL.  It returns nothing, instead.

    I just can't imagine what the designers of this function where thinking considering the two major (IMHO) problems it has.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Comments posted to this topic are about the item STRING_SPLIT with Index as Table-Valued Function

  • Fair points. I've updated the function to return a "0, No value" record if a NULL value is passed to the function. An empty string within the function returns an empty string, but I'm OK with that. While "correct elemental order" accuracy may be an issue, we've had success with this returning results as expected, but we'll keep an eye on it. The issue we faced was with data that had serious data entry / governance issues and the length of the strings / number of segments were a irregular as the segments themselves, so any traditional "looping" solutions were prone to more issues than this function would present.

  • bteague wrote:

    Fair points. I've updated the function to return a "0, No value" record if a NULL value is passed to the function. An empty string within the function returns an empty string, but I'm OK with that. While "correct elemental order" accuracy may be an issue, we've had success with this returning results as expected, but we'll keep an eye on it. The issue we faced was with data that had serious data entry / governance issues and the length of the strings / number of segments were a irregular as the segments themselves, so any traditional "looping" solutions were prone to more issues than this function would present.

    Did the maximum number of bytes of such strings exceed 8000 bytes?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • No. Not even close. Usually less than 50 characters.

  • disclaimer, I am not arguing.

    _however_

    You really can't blame the original developers, everybody always talks about how relational databases have no implicit order and its very understandable that, given this philosophy, the developers of this function string_split therefore did not consider giving the ordinal position of each returned split value.

    Heck, I didn't get my way either, I want a splitter that changes the split character based on results earlier on in the string AND also returns the character doing the split. Fortunately in my case, I can always pull out my punched cards....

     

     

  • Absolutely and all fair points, but until SQL Server finally supports the "occurence" argument as Oracle has for a while in its INSTR function, I'll use this as a "good enough" solution when appropriate.

  • bteague wrote:

    Absolutely and all fair points, but until SQL Server finally supports the "occurence" argument as Oracle has for a while in its INSTR function, I'll use this as a "good enough" solution when appropriate.

    It's never appropriate because, in this case and especially since you're not breaking the 8K byte barrier, there are well known methods to do it right all the time and with some damned good performance, as well. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply