March 19, 2019 at 7:26 am
Eirikur Eiriksson - Tuesday, March 19, 2019 7:18 AMQuick question, are there any multi-line entries or can one use the new line bytes as a row delimiter?
😎
To whom are you directing that question?
--Jeff Moden
March 19, 2019 at 8:52 am
Jeff Moden - Tuesday, March 19, 2019 7:26 AMEirikur Eiriksson - Tuesday, March 19, 2019 7:18 AMQuick question, are there any multi-line entries or can one use the new line bytes as a row delimiter?
😎To whom are you directing that question?
The OP
😎
The spanner in the works is the multi row entries, big difference from handling single row entries 😉
March 20, 2019 at 5:33 am
Eirikur Eiriksson - Tuesday, March 19, 2019 8:52 AMJeff Moden - Tuesday, March 19, 2019 7:26 AMEirikur Eiriksson - Tuesday, March 19, 2019 7:18 AMQuick question, are there any multi-line entries or can one use the new line bytes as a row delimiter?
😎To whom are you directing that question?
The OP
😎The spanner in the works is the multi row entries, big difference from handling single row entries 😉
Agreed but not much of one if each "record" has the same pattern of multi-line entries. I do use format files for such a thing if the multi-line entries are consistent.
--Jeff Moden
January 8, 2020 at 12:54 am
Hey, Jeff, will you update this yet again to include the new split_string() function in SQL Server 2016+?
January 8, 2020 at 3:33 am
Hey, Jeff, will you update this yet again to include the new split_string() function in SQL Server 2016+?
Probably not. The STRING_SPLIT function is relatively crippled for those that need to know the position of the split out elements. That's been pretty well covered by Wayne Sheffield when it first came out. Here's the link...
https://www.sqlservercentral.com/articles/splitting-strings-in-sql-server-2016
--Jeff Moden
December 25, 2020 at 5:23 pm
corey lawson wrote:Hey, Jeff, will you update this yet again to include the new split_string() function in SQL Server 2016+?
Probably not. The STRING_SPLIT function is relatively crippled for those that need to know the position of the split out elements. That's been pretty well covered by Wayne Sheffield when it first came out. Here's the link...
https://www.sqlservercentral.com/articles/splitting-strings-in-sql-server-2016
Long story short, here's a schemabound ordinal splitter tvf which uses the string_split function as its base and then uses charindex with the "double separator append" trick. Maybe this is an efficient approach?
drop function if exists dbo.string_split_kvp;
go
create function dbo.string_split_kvp(
@string varchar(8000),
@separator char(1))
returns table with schemabinding as return
select row_number() over (order by
charindex(concat(@separator, [value], @separator),
concat(@separator, @string, @separator))) ,
[value]
from string_split(@string, @separator);
Example
select * from dbo.string_split_kvp('this is a space separated string', ' ');
Output
key value
1 this
2 is
3 a
4 space
5 separated
6 string
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
December 26, 2020 at 12:32 am
Jeff Moden wrote:corey lawson wrote:Hey, Jeff, will you update this yet again to include the new split_string() function in SQL Server 2016+?
Probably not. The STRING_SPLIT function is relatively crippled for those that need to know the position of the split out elements. That's been pretty well covered by Wayne Sheffield when it first came out. Here's the link...
https://www.sqlservercentral.com/articles/splitting-strings-in-sql-server-2016
Long story short, here's a schemabound ordinal splitter tvf which uses the string_split function as its base and then uses charindex with the "double separator append" trick. Maybe this is an efficient approach?
drop function if exists dbo.string_split_kvp;
go
create function dbo.string_split_kvp(
@string varchar(8000),
@separator char(1))
returns table with schemabinding as return
select row_number() over (order by
charindex(concat(@separator, [value], @separator),
concat(@separator, @string, @separator))) ,
[value]
from string_split(@string, @separator);Example
select * from dbo.string_split_kvp('this is a space separated string', ' ');Output
key value
1 this
2 is
3 a
4 space
5 separated
6 string
apart from the performance that is likely to be quite bad it has errors on its construct.
try it with
select * from dbo.string_split_kvp('this has 2 repeated has', ' ');
December 26, 2020 at 1:21 am
apart from the performance that is likely to be quite bad has errors on its construct
The Construct is a problem but I suspect the performance should be fine, STRING_SPLIT is quite fast. The logic will force a sort but you will only sort up to 8001 rows which is quite low.
That said, the logic is broke. STRING_SPLIT is great when you don't need the items in a specific order. I use it for things like parsing multi-select strings in SSRS. If order is important then STRING_SPLIT is not the solution.
-- Itzik Ben-Gan 2001
December 26, 2020 at 2:47 am
Yes the logic is broke. I agree with all of Alan's comments. It seems possible to take corrective steps using cte(s) and additional windowing functions where the cardinality of the set(s) is the word count and not the character count. Maybe that doesn't get any performance gains though. I'll keep trying things. This issue of ordering split strings seems to come up over and over.
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
December 26, 2020 at 3:17 am
December 26, 2020 at 1:15 pm
How about "This string is a space separated string"?
will this function work?
By "work" if you mean "give the right answer" then no. It doesn't return any error though. But I do have another idea to try. Once I'm back at home I'll test it out.
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
December 27, 2020 at 5:18 pm
@ Steve Collins,
Great try/good thinking. I'm thinking, though, that just about anything you add to it is going to slow it way down... especially since, if you read the article, the original performance problem was concatenation.
@ Alan,
I'm honestly a little bit surprised that you're justifying a sort because it has a low row count. We know how that usually works out.
@Everyone,
The function is this article served well for a long time but it has soundly been beaten for performance after 2012 came out by Eirikur Eiriksson. His good code still avoids concatenation and his change was super simple... Basically, he replaced CHAR index with LEAD. Prior to him posting his solution, he asked me to test it for him and I tested for accurate functionality and performance... It IS ACCURATE and it IS TWICE AS FAST as the original (getting real close to CLR speeds).
With that I'll say that if you're using 2012 or above, I strongly recommend you upgrade your function to his code, with due credit, of course.
Here's the link to Eirikur's article (the code is in Part 1 of his fine article).
https://www.sqlservercentral.com/articles/reaping-the-benefits-of-the-window-functions-in-t-sql-2
If you want to performance test against something, you should performance test against Eirikur's code and I should have Steve Jones update this article to point to Eirikur's.
--Jeff Moden
December 27, 2020 at 8:17 pm
@ Alan,
I'm honestly a little bit surprised that you're justifying a sort because it has a low row count. We know how that usually works out.
I was a little rushed in my response. I'll clarify when I get time 😉
-- Itzik Ben-Gan 2001
Viewing 13 posts - 946 through 958 (of 958 total)
You must be logged in to reply to this topic. Login to reply