STRING_SPLIT with no delimiter

  • Comments posted to this topic are about the item STRING_SPLIT with no delimiter

  • Awesome thinking outside the box!  Well done!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Really interesting question, thanks.
    Will  definitely be investigating how to use this with some of the delimited data sets we import

    ____________________________________________
    Space, the final frontier? not any more...
    All limits henceforth are self-imposed.
    β€œlibera tute vulgaris ex”

  • Good to know!
    I would add  a where clause  to avoid empty string
    DECLARE @a VARCHAR(6) = 'ABCDEF'
    SELECT value FROM STRING_SPLIT(CAST(CAST(CAST(@A AS NVARCHAR) AS VARBINARY) AS VARCHAR), CHAR(0))
    WHERE value <>''

  • Really interesting Q with a very instructive A - thank you!

  • I must admit I wouldn't have thought of doing that!

  • WOW !!
    Awesome question and great explanation, BUT πŸ™‚
    "3. CAST(... AS VARCHAR) converts the binary value, byte-by-byte into a new string. If you select this value in SQL Server, it will return 'A', but it actually holds 'A{0}B{0}C{0}D{0}E{0}F{0}' where {0} represents the 0 byte values from the varbinary."
    How would I know this without being told, if I can't see it ?

  • Budd - Monday, December 11, 2017 6:39 AM

    WOW !!
    Awesome question and great explanation, BUT πŸ™‚
    "3. CAST(... AS VARCHAR) converts the binary value, byte-by-byte into a new string. If you select this value in SQL Server, it will return 'A', but it actually holds 'A{0}B{0}C{0}D{0}E{0}F{0}' where {0} represents the 0 byte values from the varbinary."
    How would I know this without being told, if I can't see it ?

    You have to know that NVARCHAR is two bytes, one of which identifies (for lack of a better term) the language being used.  For the U.S. and many English speaking countries, that "language byte" is "00".  The way to see it is to convert it to varbinary and realize that each character for NVARCHAR is made up of two bytes each.  You also have to know the "ASCII" table (AsciiTable.com) so that you understand that the capital letter "A" is numerically a decimal value of "65" and understand that you need to convert (or look it up at the AsciiTable.com site) that to hexadecimal "41" to find the "A's" in the string of bytes.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Pretty slick! It took me a moment with the multiple casts but that was a slick trick.:smooooth:

  • It seems to me the answer to this question is more about string manipulation through data type conversion than it is about String_Split function.  And by altering the source string to fit the necessary parameters of the function, the original question has now been altered.  By needing to have a deep understanding of the data type storage properties to get the correct answer, then this question is about that knowledge, not about understanding the String_Split function. The correct answer should have been "you cannot do this with String_Split() without first altering the string to include a separator".

  • Jeff Moden - Monday, December 11, 2017 6:52 AM

    Budd - Monday, December 11, 2017 6:39 AM

    WOW !!
    Awesome question and great explanation, BUT πŸ™‚
    "3. CAST(... AS VARCHAR) converts the binary value, byte-by-byte into a new string. If you select this value in SQL Server, it will return 'A', but it actually holds 'A{0}B{0}C{0}D{0}E{0}F{0}' where {0} represents the 0 byte values from the varbinary."
    How would I know this without being told, if I can't see it ?

    You have to know that NVARCHAR is two bytes, one of which identifies (for lack of a better term) the language being used.  For the U.S. and many English speaking countries, that "language byte" is "00".  The way to see it is to convert it to varbinary and realize that each character for NVARCHAR is made up of two bytes each.  You also have to know the "ASCII" table (AsciiTable.com) so that you understand that the capital letter "A" is numerically a decimal value of "65" and understand that you need to convert (or look it up at the AsciiTable.com site) that to hexadecimal "41" to find the "A's" in the string of bytes.

    One way to visualize this is with the following code:

    DECLARE @a VARCHAR(6) = 'ABCDEF'
    SELECT CAST(CAST(@A AS NVARCHAR) AS VARBINARY)
    SELECT CAST(@A AS VARBINARY)

    The first SELECT shows the language bytes in the string. The second SELECT shows the same string without the language bytes.
    Hope this helps.

  • bknight 46549 - Monday, December 11, 2017 7:29 AM

    It seems to me the answer to this question is more about string manipulation through data type conversion than it is about String_Split function.  And by altering the source string to fit the necessary parameters of the function, the original question has now been altered.  By needing to have a deep understanding of the data type storage properties to get the correct answer, then this question is about that knowledge, not about understanding the String_Split function. The correct answer should have been "you cannot do this with String_Split() without first altering the string to include a separator".

    Please don't take personal offense but my opinion differs and I disagree with that.  The question was "How can I do this using the STRING_SPLIT() function in SQL 2016+?".  One of the answers clearly demonstrates that you can do it using the function even if it does entail a bit of extra knowledge and a little SQL prestidigitation.  Answer D simply states that it couldn't be done with the function, period, which is clearly untrue.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • bknight 46549 - Monday, December 11, 2017 7:29 AM

    It seems to me the answer to this question is more about string manipulation through data type conversion than it is about String_Split function.  And by altering the source string to fit the necessary parameters of the function, the original question has now been altered.  By needing to have a deep understanding of the data type storage properties to get the correct answer, then this question is about that knowledge, not about understanding the String_Split function. The correct answer should have been "you cannot do this with String_Split() without first altering the string to include a separator".

    I see where you're coming from -- you can't call STRING_SPLIT() directly on the original string to accomplish this, but that's only half of the point of this question. The other half is that there's often a way to accomplish a task which isn't directly supported, simply by changing the way you look at the data or the problem.

    Yes, that does require some understanding of data types and how they are formatted. But when you're working with data at production scale, you should have that understanding, or the ability to research it quickly, anyway.

    I always try to format my questions so that you can easily run the options in a test instance if you're not sure about the more technical aspects -- I don't consider these questions an exam on anyone's actual skill level, but an opportunity to share and learn.

  • I tried the answer on SQL Server 2016, the result is only one A?

    THX

  • werner.grimmer - Monday, December 11, 2017 10:25 AM

    I tried the answer on SQL Server 2016, the result is only one A?

    THX

    There's something wrong on your box or maybe just different... I tried it on 3 different 2016 boxes and it works just fine.  What is your default language for the server?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 37 total)

You must be logged in to reply to this topic. Login to reply