I 100% agree with you that his test script wasn't the best. You've already dressed down his number table splitter, and that XML splitter doesn't use any LOB caching techniques, which forces it to recreate the XML thousands of times over.
I installed a copy of SQL Server 2016 RC1 on my laptop. This is a fresh install, and this laptop has never had SQL Server on it before. The specs are:
Operating System: Windows 10 Pro 64-bit (10.0, Build 10586) (10586.th2_release_sec.160223-1728)
Processor: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz (4 CPUs), ~2.3GHz
Memory: 8192MB RAM
FYI, this machine also shows the issue with the trace flag causing a drastic shift in performance.
I added the following function to my test script to bring it more in line with the other functions in the race:
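if object_id('dbo.stringsplit') is not null drop function dbo.stringsplit
go
-- inline TVF wrapping the built-in so it matches the other splitters' output shape
create function dbo.stringsplit(@pstring varchar(8000), @delimiter char(1))
returns table
with schemabinding as
return
select itemNumber = ROW_NUMBER() over (order by (select 1)),
       item = value
from string_split(@pstring, @delimiter)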
I realize this isn't perfect, as there is no mention of an ordering guarantee, so itemNumber may not always match an item's position in the input string.
With it added to my test script, stringsplit performs well, as I suspected.
I'm kind of shocked that my hybrid split approach actually beats the built-in. Admittedly, though, stringsplit is a polymorphic function, which gives it an advantage over my hybrid split approach, which can only handle varchar inputs under 8k.
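To make the difference in output shape concrete, here is a minimal sketch of calling the built-in directly and through the wrapper above. The literal 'a,b,c' is just an illustrative input, and string_split needs the database to be at compatibility level 130.

```sql
-- Direct call to the built-in: a single output column named value.
select value
from string_split('a,b,c', ',');

-- Call through the wrapper: adds itemNumber and renames value to item.
-- The numbering follows whatever order string_split returns its rows in,
-- which is not documented as guaranteed.
select itemNumber, item
from dbo.stringsplit('a,b,c', ',');
```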
For a final test, I modified the test script once more. I dropped the item ordinal and renamed the expected output column in my test script to value, to try to capture how much of an effect the wrapper iTVF had. I got the following result:
Now it barely ekes out a victory over fn_split. For some reason the wrapper function adds a pretty decent overhead to string_split, and I'm not sure why. It could be the coercion of the argument to varchar(8000). Also, if I make the wrapper parameter varchar(max), the performance is bad (roughly 3x slower) but still faster than delimitedsplit.
Once the built-in adds an item ordinal, it should be fast enough, and while a custom CLR-based approach can always be a bit faster, the improvement is marginal at most.
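For anyone who wants to reproduce the varchar(max) comparison, a sketch of that variant might look like the following. The name dbo.stringsplit_max is hypothetical and is not taken from the gist; only the parameter type differs from the varchar(8000) wrapper.

```sql
-- Hypothetical varchar(max) variant of the wrapper, for comparison only.
if object_id('dbo.stringsplit_max') is not null drop function dbo.stringsplit_max
go
create function dbo.stringsplit_max(@pstring varchar(max), @delimiter char(1))
returns table
with schemabinding as
return
select item = value
from string_split(@pstring, @delimiter)
go
-- Same input through the built-in and through the varchar(max) wrapper.
select value from string_split('a,b,c', ',');
select item  from dbo.stringsplit_max('a,b,c', ',');
```

Timing the two selects over a large table of delimited strings, rather than a single literal, is what would actually show the wrapper overhead.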
Full gist with splitter source code and test script
and if you just want the final version of the test
2016 version of script