﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / SQL Server 2005 / T-SQL (SS2K5)  / Performance issue with tally solution / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Sun, 19 May 2013 21:05:50 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Adam Machanic (10/3/2012)[/b][hr]Where's Florian these days?[/quote]Haven't seen him for ages on SSC.  He is @Florian_Reischl on twitter, if that helps.[quote][url]https://connect.microsoft.com/SQLServer/feedback/details/765930/clr-tvfs-are-not-gracefully-handled-during-appdomain-recycles[/url][/quote]Voted.</description><pubDate>Wed, 03 Oct 2012 19:24:28 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]RBarryYoung (10/3/2012)[/b][hr]The problem is though, that this is how the data is being delivered to me, as a multi-million row pipe-delimited text file, downloaded regularly from the IRS website.  I import the data with BULK INSERT, but (to answer the obvious question), I cannot use BULK INSERT to do the splitting because some few rows at the beginning and the end are non-conformant (they are not real data rows).  And as we all know, BULK INSERT (and BCP) are not exactly tolerant of non-conforming input rows.  And I cannot exclude them reliably with BULK INSERT because we cannot know for sure ahead of time how many or which ones at the beginning and end will be non-conformant.  And for legal reasons, we have to be [i]sure[/i] that we have not lost any rows.[/quote]Have you thought about combining the CLR splitter with OPENROWSET...BULK to avoid the staging table?  That gives us a way to achieve minimally-logged inserts directly from the source file while transforming the data (= rejecting bad rows and splitting good ones) in the bulk flow.  I wrote a demo of the basic idea here: [url]http://www.sqlservercentral.com/Forums/FindPost916633.aspx[/url]</description><pubDate>Wed, 03 Oct 2012 19:22:45 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]RBarryYoung (10/3/2012)[/b][hr]I am using this one here: [url]http://www.sqlservercentral.com/Forums/FindPost1356773.aspx[/url], I had assumed that it was Paul's...[/quote]Yes that's mine.</description><pubDate>Wed, 03 Oct 2012 19:06:19 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Adam Machanic (10/3/2012)[/b][hr][quote][b]RBarryYoung (10/3/2012)[/b][quote]So right now, I'm using Paul's latest CLR splitter...[/quote]You mean mine, that Paul posted the binaries for! :-)[/quote]I am using this one here: [url]http://www.sqlservercentral.com/Forums/FindPost1356773.aspx[/url], I had assumed that it was Paul's because he had not mentioned anyone else there and I had not seen you on that thread recently.  If this is fact your work Adam, then I sincerely apologize for not crediting you properly.</description><pubDate>Wed, 03 Oct 2012 18:12:54 GMT</pubDate><dc:creator>RBarryYoung</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]RBarryYoung (10/3/2012)[/b][hr]So the obvious solution is to BULK INSERT them to a staging table with one big VARCHAR() column and then split in inside of SQL where we can more readily detect and handle the exceptions.  But that's not [i]nearly[/i] as fast as BULK INSERT ...[/quote]The "obvious" solution to me would be to use an external ETL package (either SSIS, or a custom thing built in C# or Powershell), do the business logic and string splitting outside of SQL Server, then bulk in the result. But of course "obvious" is very, very context dependent and you probably haven't told the entire story.[quote]So right now, I'm using Paul's latest CLR splitter...[/quote]You mean mine, that Paul posted the binaries for! :-)</description><pubDate>Wed, 03 Oct 2012 12:24:54 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Jeff Moden (9/6/2012)[/b][hr][quote][b]UMG Developer (9/5/2012)[/b][hr][quote][b]Adam Machanic (9/5/2012)[/b][hr]How are you seeing string splits used where such a difference would matter? (Again, I am *only* talking about string split operations.)Generally when I see string splitters used it's for passing a set of arguments to a stored procedure, and it will end up in some query like:SELECT * FROM TableWHERE ID IN (SELECT CONVERT(INT, output) FROM dbo.StringSplitFunction(@values))[/quote]Adam,I have cases where I have a table with ~20 million rows and each row contains a string that needs to be split, so that is a case where the splitter would be called 20 million times in a single INSERT query, so any optimization to make it faster can make a big difference.[/quote]I have to agree with Adam on this one.  Why would you store such a volume of comma separated data in a database?  A normalized sister table would be much more efficient.[/quote]Actually, I am in exactly this situation right now, Jeff.  As I'm sure you know, I would never voluntarily store data like that, not permanently anyway.  The problem is though, that this is how the data is being delivered to me, as a multi-million row pipe-delimited text file, downloaded regularly from the IRS website.  I import the data with BULK INSERT, but (to answer the obvious question), I cannot use BULK INSERT to do the splitting because some few rows at the beginning and the end are non-conformant (they are not real data rows).  And as we all know, BULK INSERT (and BCP) are not exactly tolerant of non-conforming input rows.  And I cannot exclude them reliably with BULK INSERT because we cannot know for sure ahead of time how many or which ones at the beginning and end will be non-conformant.  And for legal reasons, we have to be [i]sure[/i] that we have not lost any rows.So the obvious solution is to BULK INSERT them to a staging table with one big VARCHAR() column and then split in inside of SQL where we can more readily detect and handle the exceptions.  But that's not [i]nearly[/i] as fast as BULK INSERT ...To import and split a million row file with BULK INSERT takes me about 10 seconds (when I know ahead of time, which rows are bad).  Just making a SELECT INTO copy of the imported table takes about 4 seconds.  But splitting it with the fastest TSQL splitters I can find take about 2-4 [b][i]minutes[/i][/b]!  And the rows aren't big, just 7 columns with about 100 characters.  So right now, I'm using Paul's latest CLR splitter which does pretty well (about 17 seconds).  That's going to cause some maintenance issues for this customer, but given the speed difference, they'll probably agree to it.</description><pubDate>Wed, 03 Oct 2012 12:08:50 GMT</pubDate><dc:creator>RBarryYoung</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>Hi all,No more comments on my newer CLR version? That's unfortunate. Where's Florian these days?Anyway, I just discovered what I consider to be a major issue with CLR TVF based solutions: they don't unload gracefully when an AppDomain recycle occurs. This means that your string split function can be merrily running along, splitting strings, and next thing you know you've hit a strange and misleading exception -- because some OTHER person's code failed.Bad state of affairs. And not the case with the other major types of CLR modules (stored procedures and scalar UDFs), so it seems like this should be able to be fixed. If you'd like to see that happen, please vote here:[url]https://connect.microsoft.com/SQLServer/feedback/details/765930/clr-tvfs-are-not-gracefully-handled-during-appdomain-recycles[/url]Thanks!Adam</description><pubDate>Wed, 03 Oct 2012 11:12:46 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Jeff Moden (9/6/2012)[/b][hr][quote][b]UMG Developer (9/5/2012)[/b][hr][quote][b]Adam Machanic (9/5/2012)[/b][hr]How are you seeing string splits used where such a difference would matter? (Again, I am *only* talking about string split operations.)Generally when I see string splitters used it's for passing a set of arguments to a stored procedure, and it will end up in some query like:SELECT * FROM TableWHERE ID IN (SELECT CONVERT(INT, output) FROM dbo.StringSplitFunction(@values))[/quote]Adam,I have cases where I have a table with ~20 million rows and each row contains a string that needs to be split, so that is a case where the splitter would be called 20 million times in a single INSERT query, so any optimization to make it faster can make a big difference.[/quote]I have to agree with Adam on this one.  Why would you store such a volume of comma separated data in a database?  A normalized sister table would be much more efficient.[/quote]I guess, it's not really about "why would you store such....", it's more to do with "if you have it (including may be some staging storage) and you need to transform it". The source data for transformation can come from very different sources including non-relational ones, so you may not be able to discuss the format of input feeds...</description><pubDate>Thu, 06 Sep 2012 07:36:20 GMT</pubDate><dc:creator>Eugene Elutin</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]UMG Developer (9/5/2012)[/b][hr][quote][b]Adam Machanic (9/5/2012)[/b][hr]How are you seeing string splits used where such a difference would matter? (Again, I am *only* talking about string split operations.)Generally when I see string splitters used it's for passing a set of arguments to a stored procedure, and it will end up in some query like:SELECT * FROM TableWHERE ID IN (SELECT CONVERT(INT, output) FROM dbo.StringSplitFunction(@values))[/quote]Adam,I have cases where I have a table with ~20 million rows and each row contains a string that needs to be split, so that is a case where the splitter would be called 20 million times in a single INSERT query, so any optimization to make it faster can make a big difference.[/quote]I have to agree with Adam on this one.  Why would you store such a volume of comma separated data in a database?  A normalized sister table would be much more efficient.</description><pubDate>Thu, 06 Sep 2012 07:20:04 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]UMG Developer (9/5/2012)[/b][hr][quote]But I can split the split operation into two pieces, one that uses the &amp;gt;4k version and one that uses the &amp;lt;=4k version. I don't know how much time that would actually save me, and I don't have time to run trials right now.[/quote]Let's see, 9 microseconds per call vs. the overhead of scanning all of the data twice? Hm. Best of luck with that ;-)</description><pubDate>Wed, 05 Sep 2012 14:57:42 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Adam Machanic (9/5/2012)[/b][hr]Ah, and there is the rub: You're over the 4k threshold, which means that you're stuck with the larger definition or you're hoping that the query optimizer gives you proper startup filters (and bothers to honor them).[/quote]But I can split the split operation into two pieces, one that uses the &amp;gt;4k version and one that uses the &amp;lt;=4k version. I don't know how much time that would actually save me, and I don't have time to run trials right now.I do have other cases where I have 100s of thousands of rows that have a column that needs to be split, and it is never longer than 255 characters, so I could safely use the &amp;lt;=4k version for that job.</description><pubDate>Wed, 05 Sep 2012 14:51:30 GMT</pubDate><dc:creator>UMG Developer</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]UMG Developer (9/5/2012)[/b][hr][quote][quote]And, going deeper down the rabbit hole... How big are the strings in question?[/quote]I would say up to 25k characters, but probably averaging closer to 3k.[/quote]Ah, and there is the rub: You're over the 4k threshold, which means that you're stuck with the larger definition or you're hoping that the query optimizer gives you proper startup filters (and bothers to honor them).([url]https://connect.microsoft.com/SQLServer/feedback/details/757383/startup-predicates-not-honored-in-plans-with-parallel-nested-loops-subtrees[/url])</description><pubDate>Wed, 05 Sep 2012 14:46:13 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Adam Machanic (9/5/2012)[/b][hr]Well the only logical way to respond to that is to ask you why you're storing these strings in a table. Dare I ask?[/quote]I'm not storing them that way, that is the way the application stores the data, and is the only way for us to access it and report against it. I can store them split, but there is no flag that indicates which line(s) have been updated each day, so we have to re-split the entire table every time we need updated data. (I supposed we could keep hashes of each row or something, but it takes under an hour to split last time I ran it, so it wasn't that big of an issue.)[quote]And, going deeper down the rabbit hole... How big are the strings in question?[/quote]I would say up to 25k characters, but probably averaging closer to 3k.</description><pubDate>Wed, 05 Sep 2012 14:39:23 GMT</pubDate><dc:creator>UMG Developer</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]UMG Developer (9/5/2012)[/b][hr][quote]I have cases where I have a table with ~20 million rows and each row contains a string that needs to be split, so that is a case where the splitter would be called 20 million times in a single INSERT query, so any optimization to make it faster can make a big difference.[/quote]Well the only logical way to respond to that is to ask you why you're storing these strings in a table. Dare I ask?And, going deeper down the rabbit hole... How big are the strings in question?</description><pubDate>Wed, 05 Sep 2012 14:30:47 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Adam Machanic (9/5/2012)[/b][hr]How are you seeing string splits used where such a difference would matter? (Again, I am *only* talking about string split operations.)Generally when I see string splitters used it's for passing a set of arguments to a stored procedure, and it will end up in some query like:SELECT * FROM TableWHERE ID IN (SELECT CONVERT(INT, output) FROM dbo.StringSplitFunction(@values))[/quote]Adam,I have cases where I have a table with ~20 million rows and each row contains a string that needs to be split, so that is a case where the splitter would be called 20 million times in a single INSERT query, so any optimization to make it faster can make a big difference.</description><pubDate>Wed, 05 Sep 2012 14:09:05 GMT</pubDate><dc:creator>UMG Developer</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Eugene Elutin (9/5/2012)[/b][hr]Yes, in my experience I have seen a lot of real-world database projects where such performance difference (on large data-sets) would easily justify the maintainability of such overhead. What we are talking here is just a declaration on a SQL Server, there is no much to maintain really.[/quote]How are you seeing string splits used where such a difference would matter? (Again, I am *only* talking about string split operations.)Generally when I see string splitters used it's for passing a set of arguments to a stored procedure, and it will end up in some query like:SELECT * FROM TableWHERE ID IN (SELECT CONVERT(INT, output) FROM dbo.StringSplitFunction(@values))In this case the string split function will be called once per query, not once per row, and the 9 microsecond difference will never amount to anything.This is not just a separate declaration that you need to maintain. You need to maintain separate calls, and somewhere your code needs to understand what to do with different strings.IF LEN(@values) &amp;lt;= 4000 THEN... /* call the version for small strings */ELSE... /* call the version for larger strings */And now we need to put these checks everywhere we do string splitting. Or perhaps you can wrap your two declarations in a single outer TVF, if you're very careful with that code and you trust the query optimizer to produce adequate startup filters so that both functions aren't called behind the scenes. And after all of that you still have three objects sitting in the database where you had only one before... </description><pubDate>Wed, 05 Sep 2012 13:07:44 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>I have seen overhead with operations on NVARCHAR(MAX) on all types of CLR's eg. table-valued one. I think it is related to the way of memory reservation for MAX type. I will not be surprised if the real boundary is NVARCHAR(4000). [quote]... Have you seen a real-world workload would actually benefit enough to justify the maintainability overhead that having multiple split functions would introduce?[/quote]Yes, in my experience I have seen a lot of real-world database projects where such performance difference (on large data-sets) would easily justify the maintainability of such overhead. What we are talking here is just a declaration on a SQL Server, there is no much to maintain really.</description><pubDate>Wed, 05 Sep 2012 12:50:50 GMT</pubDate><dc:creator>Eugene Elutin</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>Interesting repro.I can see the same difference you do on both SQL Server 2008 and 2012, but I'm not sure the concerns are the same across problem domains. Your repro shows a scalar operation ("trim"), whereas this thread is primarily concerned with a table-valued operation ("split"). We're talking about what amounts to, on average, around a 9 microsecond difference per call (on my end -- a bit more on yours). When making a scalar call on every row of a table that certainly adds up, but table-valued functions, especially split, aren't used that way -- at least, in my experience. Have you seen a real-world workload would actually benefit enough to justify the maintainability overhead that having multiple split functions would introduce?</description><pubDate>Wed, 05 Sep 2012 10:50:14 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>I took my function from this one: [url]http://www.sqlservercentral.com/Forums/Topic1348575-338-1.aspx[/url]CLR:[code="other"]using System.Data;using System.Data.SqlClient;using System.Data.SqlTypes;using Microsoft.SqlServer.Server;using System.Text.RegularExpressions;namespace CLRPlay{    public partial class UserDefinedFunctions    {        static readonly Regex _regex = new Regex(@"^[0]+|[0]+$",RegexOptions.Compiled);               [Microsoft.SqlServer.Server.SqlFunction]        public static SqlString TrimLTZr(SqlString val)        {            // Put your code here            if (val.IsNull) return SqlString.Null;            return _regex.Replace(val.ToString(), String.Empty);        }    };}[/code]Registration:[code="sql"]/****** Object:  UserDefinedFunction [dbo].[TrimLTZr]    Script Date: 09/05/2012 16:51:51 ******/CREATE FUNCTION [dbo].[TrimLTZr](@val [nvarchar](550))RETURNS [nvarchar](550) WITH EXECUTE AS CALLERAS EXTERNAL NAME [CLRPlay].[CLRPlay.UserDefinedFunctions].[TrimLTZr]GOCREATE FUNCTION [dbo].[TrimLTZrM](@val [nvarchar](max))RETURNS [nvarchar](max) WITH EXECUTE AS CALLERAS EXTERNAL NAME [CLRPlay].[CLRPlay.UserDefinedFunctions].[TrimLTZr]GO[/code]Test (as per Jeff Moden way of time measurement):[code="sql"]--generate 1,000,000 rows of test dataSELECT TOP 1000000    SomeString = REPLICATE('0',ABS(CHECKSUM(NEWID()))%5) -- 0 to 4 leading zeros  + 'temp'+RIGHT('0000000'+CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(10)),7)INTO #TestTableFROM sys.all_columns ac1CROSS JOIN sys.all_columns ac2  --=======================================================================================--      Run the tests using a duration timer because SET STATISTICS TIME ON induces--      large delay times because of the RBAR nature of Scalar UDFs.--=======================================================================================--===== Declare some obviously named variables. DECLARE @BitBucket  VARCHAR(8000),        @StartTime  DATETIME,        @DurationMS INT    --===== Run the tests on each function        RAISERROR('=======================================',0,1) WITH NOWAIT;        RAISERROR('============ Running Tests ============',0,1) WITH NOWAIT;        RAISERROR('=======================================',0,1) WITH NOWAIT;     -- Note that @BitBucket allows us to take display and disk time out of the equation.----------------------------------------------------------------------------------------- SELECT @StartTime = GETDATE();        RAISERROR('============ TrimLTZr Varchar(550)',0,1) WITH NOWAIT; SELECT @BitBucket = dbo.TrimLTZr(SomeString)   FROM #TestTable; SELECT @DurationMS = DATEDIFF(ms,@StartTime,GETDATE())        RAISERROR('Duration ms: %u',0,1,@DurationMS) WITH NOWAIT;----------------------------------------------------------------------------------------- SELECT @StartTime = GETDATE();        RAISERROR('============ TrimLTZr Varchar(MAX)',0,1) WITH NOWAIT; SELECT @BitBucket = dbo.TrimLTZrM(SomeString)   FROM #TestTable; SELECT @DurationMS = DATEDIFF(ms,@StartTime,GETDATE())        RAISERROR('Duration ms: %u',0,1,@DurationMS) WITH NOWAIT;[/code]Results:[code="other"]=================================================== Running Tests =============================================================== TrimLTZr Varchar(550)Duration ms: 2610============ TrimLTZr Varchar(MAX)Duration ms: 14843[/code]Sorry for waiting....Please note: the input test data is the same for both functions and it is not NVARCHAR(MAX).</description><pubDate>Wed, 05 Sep 2012 10:04:46 GMT</pubDate><dc:creator>Eugene Elutin</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Eugene Elutin (9/5/2012)[/b][hr]1. In short: Yes I can[/quote]Great, I look forward to seeing your repro.</description><pubDate>Wed, 05 Sep 2012 09:50:37 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Adam Machanic (9/5/2012)[/b][hr][quote][b]Eugene Elutin (9/5/2012)[/b][hr]You may not agree. But if you need to split NVARCHAR(max) and NVARCHAR(100), CLR function declared with NVARCHAR(100) as input parameter will perform faster than one declared with NVARCHAR(MAX).[/quote]Can you prove that?[quote]Same here. I've not stated that RegEx should be used for splitting strings or finding exact string within a string.  RegEx is not a tool for that. It is for finding a PATTERN.  Let say you want to find every occurrence of GUID (any GUID) within a string... I don't think you will get to far with "simple string comparison".[/quote]This is a thread about string splitting.[/quote]1. In short: Yes I can2. There are over 52 pages on this thread discussing many different aspects and not only  string splitting. I've just read through few pages and found few people mentioned using RegEx. So, was my advice...</description><pubDate>Wed, 05 Sep 2012 09:48:35 GMT</pubDate><dc:creator>Eugene Elutin</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Eugene Elutin (9/5/2012)[/b][hr]You may not agree. But if you need to split NVARCHAR(max) and NVARCHAR(100), CLR function declared with NVARCHAR(100) as input parameter will perform faster than one declared with NVARCHAR(MAX).[/quote]Can you prove that?[quote]Same here. I've not stated that RegEx should be used for splitting strings or finding exact string within a string.  RegEx is not a tool for that. It is for finding a PATTERN.  Let say you want to find every occurrence of GUID (any GUID) within a string... I don't think you will get to far with "simple string comparison".[/quote]This is a thread about string splitting.</description><pubDate>Wed, 05 Sep 2012 09:37:43 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote]I do not agree that there is any point in trying to optimize for smaller strings. Especially given the streaming solution, I think you'd be hard pressed to find a case where you'd get any benefit from having more than one function. [/quote]You may not agree. But if you need to split NVARCHAR(max) and NVARCHAR(100), CLR function declared with NVARCHAR(100) as input parameter will perform faster than one declared with NVARCHAR(MAX).[quote]RegEx, same deal. We tried that upthread and found that it was actually slower than doing a simple string comparison. But maybe we missed something? Show us a test case where it's faster!In both cases I'd question the benefit over the obvious maintainability issues. One function per character on the compiled RegEx side, multiplied times one function each for small and large strings. That's a lot of functions. I'd rather just come up with one function that does everything as well as it possibly can. And I think we've done that already, but there's always room for a bit more optimization.[/quote]Same here. I've not stated that RegEx should be used for splitting strings or finding exact string within a string.  RegEx is not a tool for that. It is for finding a PATTERN.  Let say you want to find every occurrence of GUID (any GUID) within a string... I don't think you will get to far with "simple string comparison".My point was for cases where you do use RegEx for what is really best at.  And it is very valid point in terms of performance. Right now the only what we have in T-SQL is PATINDEX which can only find the pattern within a string. I think, even if MS will implement full RegEx kind of functionality in T-SQL (eg.RegEx.Replace method), using CLR where it's defined as static will be still faster.  </description><pubDate>Wed, 05 Sep 2012 09:28:27 GMT</pubDate><dc:creator>Eugene Elutin</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Eugene Elutin (9/5/2012)[/b][hr]Also, I would like to add my two pence into  discussion of CLR performance:1. You may declare the same CLR function (operating with strings) multiple times in SQL. What will it give you? You will find that performance of VARCHAR(100) and VARCHAR(MAX) is quite different... So, the same function can be "optimised" for use with smaller strings.2. If you use RegEx in CLR, you will find that declaring it as static with compiled option boost its performance. Yes, it is less flexible as you can not pass regular expression as parameter, but it's significantly faster as RegEx object needs to initialise once.[/quote]I do not agree that there is any point in trying to optimize for smaller strings. Especially given the streaming solution, I think you'd be hard pressed to find a case where you'd get any benefit from having more than one function. RegEx, same deal. We tried that upthread and found that it was actually slower than doing a simple string comparison. But maybe we missed something? Show us a test case where it's faster!In both cases I'd question the benefit over the obvious maintainability issues. One function per character on the compiled RegEx side, multiplied times one function each for small and large strings. That's a lot of functions. I'd rather just come up with one function that does everything as well as it possibly can. And I think we've done that already, but there's always room for a bit more optimization.</description><pubDate>Wed, 05 Sep 2012 08:39:49 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]paul.knibbs (9/5/2012)[/b][hr]What about a DOS-formatted text file where the delimiter at the end of lines is a carriage return/line feed pair? :-)[/quote]Not a problem. Choose either character as your delimiter. Both characters are considered to be white space (in .NET -- not in SQL Server for some reason), and so the character other than the one you've chosen will be trimmed off by the splitter.Next! :-D</description><pubDate>Wed, 05 Sep 2012 08:34:54 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>Pong</description><pubDate>Wed, 05 Sep 2012 08:00:19 GMT</pubDate><dc:creator>allnelsons</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>test...</description><pubDate>Wed, 05 Sep 2012 07:36:10 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Eugene Elutin (9/5/2012)[/b][hr][quote][b]paul.knibbs (9/5/2012)[/b][hr][quote][b]Adam Machanic (9/4/2012)[/b]The code that follows is my crack at a more memory efficient solution. It works only for a single delimiter (I never did see the point of multi-character delimiters, to be honest)[/quote]The point of multi-character delimiters is to delimit strings which may contain delimiter as legitimate characters in itself. So, it's quite often that combination of non-alpha-numeric characters is used to delimit text data. And of course an example of mixed multi-line delimiters: CRLF, CR or LF or even LFCR.Also, I would like to add my two pence into  discussion of CLR performance:1. You may declare the same CLR function (operating with strings) multiple times in SQL. What will it give you? You will find that performance of VARCHAR(100) and VARCHAR(MAX) is quite different... So, the same function can be "optimised" for use with smaller strings.2. If you use RegEx in CLR, you will find that declaring it as static with compiled option boost its performance. Yes, it is less flexible as you can not pass regular expression as parameter, but it's significantly faster as RegEx object needs to initialise once.[/quote]Perhaps we actually agree on this but trying to find out.  Just like I said in my previous post and just like you implied in item 1 above, there are certain optimizations you can make for each type (single or multi-character) of delimiter just as you can for the different datatypes.In other words, while general purpose split code makes life easy for developers, general purpose code is almost never as fast as purposed code.  To wit, I don't believe that a splitter should be made to handle both single and multi-character delimiters just like I wouldn't simply change the DelimitedSplit8K function to handle NVARCHAR and blob data types.</description><pubDate>Wed, 05 Sep 2012 07:24:58 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]paul.knibbs (9/5/2012)[/b][hr][quote][b]Adam Machanic (9/4/2012)[/b]The code that follows is my crack at a more memory efficient solution. It works only for a single delimiter (I never did see the point of multi-character delimiters, to be honest)[/quote]What about a DOS-formatted text file where the delimiter at the end of lines is a carriage return/line feed pair? :-)[/quote]Those are handled separately and don't require the use of multi-character delimiters unless you've made the mistake of trying to handle the whole file as a single blob.  If you are trying to handle the file as a single blob, then, yes, a multi-character delimiter splitter would be in order.  However, understand that splitting on a single delimiter is a performance optimization and that you should consider having two splitters.  One to handle single character delimiters in a very high speed fashion and one to handle multi-character delimiters at slightly lower speeds.</description><pubDate>Wed, 05 Sep 2012 06:56:45 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]paul.knibbs (9/5/2012)[/b][hr][quote][b]Adam Machanic (9/4/2012)[/b]The code that follows is my crack at a more memory efficient solution. It works only for a single delimiter (I never did see the point of multi-character delimiters, to be honest)[/quote]The point of multi-character delimiters is to delimit strings which may contain delimiter as legitimate characters in itself. So, it's quite often that combination of non-alpha-numeric characters is used to delimit text data. And of course an example of mixed multi-line delimiters: CRLF, CR or LF or even LFCR.Also, I would like to add my two pence into  discussion of CLR performance:1. You may declare the same CLR function (operating with strings) multiple times in SQL. What will it give you? You will find that performance of VARCHAR(100) and VARCHAR(MAX) is quite different... So, the same function can be "optimised" for use with smaller strings.2. If you use RegEx in CLR, you will find that declaring it as static with compiled option boost its performance. Yes, it is less flexible as you can not pass regular expression as parameter, but it's significantly faster as RegEx object needs to initialise once.</description><pubDate>Wed, 05 Sep 2012 03:47:38 GMT</pubDate><dc:creator>Eugene Elutin</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Adam Machanic (9/4/2012)[/b]The code that follows is my crack at a more memory efficient solution. It works only for a single delimiter (I never did see the point of multi-character delimiters, to be honest)[/quote]What about a DOS-formatted text file where the delimiter at the end of lines is a carriage return/line feed pair? :-)</description><pubDate>Wed, 05 Sep 2012 02:30:53 GMT</pubDate><dc:creator>paul.knibbs</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]Adam Machanic (9/4/2012)[/b][hr]Interestingly, I ran some tests on a machine that has both SQL Server 2008 R2 and SQL Server 2012 installed, and all of my test queries (for either version) run around 50% faster in 2012.[/quote]I have found the same thing with streaming SQLCLR functions - another good reason to upgrade I guess.  I have no idea why Microsoft don't make more of all the small improvements they work into every new major release.[quote]The code that follows is my crack at a more memory efficient solution. It works only for a single delimiter (I never did see the point of multi-character delimiters, to be honest), and also does a few other things that some people will hate, such as trimming the output and not returning empty results. I also bolted on a row number, which a number of people have requested for the last SQLCLR version that was posted.[/quote]Not returning empty results is the only one that strikes me as odd.[quote]Anyway, here's the code. Anyone feel like optimizing, testing, or otherwise kicking the tires a bit?[/quote]Some people will appreciate the bits as well as the source, so here's my release build targeting .NET 2.0 and tested on 2005, 2008, 2008 R2, and 2012:[code="sql"]CREATE ASSEMBLY SplitStringWithRowFROM 0x4D5A90000300000004000000FFFF0000B800000000000000400000000000000000000000000000000000000000000000000000000000000000000000800000000E1FBA0E00B409CD21B8014CCD21546869732070726F6772616D2063616E6E6F742062652072756E20696E20444F53206D6F64652E0D0D0A2400000000000000504500004C010300FA9646500000000000000000E00002210B010800000E00000006000000000000CE2D0000002000000040000000004000002000000002000004000000000000000400000000000000008000000002000000000000030040850000100000100000000010000010000000000000100000000000000000000000782D00005300000000400000D003000000000000000000000000000000000000006000000C00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200000080000000000000000000000082000004800000000000000000000002E74657874000000D40D000000200000000E000000020000000000000000000000000000200000602E72737263000000D0030000004000000004000000100000000000000000000000000000400000402E72656C6F6300000C0000000060000000020000001400000000000000000000000000004000004200000000000000000000000000000000B02D00000000000048000000020005006C2300000C0A00000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000D2026F1000000A2D15036F1000000A2D0D02036F1100000A73040000062A168D15000001731200000A168D1500000173040000062A000000133002002A000000010000110274040000020A03066F0A000006281300000A810400000104066F09000006281400000A81050000012A1E02281500000A2A000003300300420000000000000002281500000A02037D0800000402048E692C050416932B021F207D0900000402204C1D00008D150000017D0A000004020273080000067D0B0000040228070000062A1E027B0B0000042A000013300600DF01000002000011027E1600000A7D07000004160A38C2000000027B0A000004027B0300000493027B090000044099000000027B03000004027B02000004316D027B070000046F1700000A16313602027B07000004027B0A000004027B02000004027B03000004027B0200000459731800000A281900000A6F1A00000A7D070000042B2902027B0A000004027B02000004027B03000004027B0200000459731800000A281A00000A7D07000004027B070000046F1700000A163102170A02027B0300000417587D0200000402257B0300000417587D03000004062D11027B03000004027B040000043F2DFFFFFF063AE4000000027B03000004027B0400000433DB027B070000046F1700000A16313102027B07000004027B0A000004027B02000004027B03000004027B0200000459731800000A281900000A7D070000042B2402027B0A000004027B02000004027B03000004027B0200000459731800000A7D0700000402167D0300000402167D0200000402027B08000004027B050000046A027B0A00000416204C1D00006F1B00000A69250B7D04000004072D2302027B070000046F1A00000A7D07000004027B070000046F1700000A16311C170A2B1802257B05000004027B04000004587D050000043805FFFFFF02257B0600000417587D06000004062AD202204C1D00007D0200000402204C1D00007D0300000402204C1D00007D0400000402167D05000004027E1600000A7D070000042A3A02281500000A02037D0C0000042A32027B0C0000047B060000042A32027B0C0000047B070000042A00000042534A4201000100000000000C00000076322E302E35303732370000000005006C00000098030000237E0000040400002004000023537472696E6773000000002408000008000000235553002C0800001000000023475549440000003C080000D001000023426C6F620000000000000002000001571FA2010902000000FA2533001600000100000016000000040000000C0000000A00000008000000010000001B000000010000000D0000000200000002000000030000000300000001000000020000000200000000000A0001000000000006005A0053000600740061000A00B0009B000A00CC009B000A00D6009B000A00D801BD0106001C02FD0106004902370206006002370206007D02370206009C0237020600B50237020600CE0237020600E902370206000403370206001D03FD0106003103370206006A034A0306008A034A030A00BB03BD010600E50353000600F6035300000000000100000000000100010001001000210000000500010001000200100036000000050002000400020010004200000005000C000800518080000A00010018010A00010020010A00010028010A00010033010A0001003F010A00010049013A00210050013D0021005A01410021006401440021006C01480021007E0156005020000000009600B900120001008820000000009600DF001B000300BE20000000008618F70026000600C820000000008618F7002A000600162100000000E609FD0032000800202100000000E6010901360008000B2300000000E6011201260008004023000000008618F700500008004F2300000000860881015A0009005C230000000086088F015E00090000000100B70100000200EA0100000100F40102000200F801020003002902000001002D0200000200EA01000001007E01030009003100F70026003900F70026004100F70090004900F70090005100F70090005900F70090006100F70090006900F70090007100F70090007900F70090008100F70095008900F70090009100F7009A009900F7002600A100F70026001900D00336001900DB0321011900F70026012100EA032C012900EA0332010900F7002600B100FD033A00B10003045A00B100F7003D01B1000E044501B10015045E0019001A044B01080004000D0020007B009F0024000B006A002E00330077012E001B0059012E00230071012E002B0071012E004B0071012E007300B0012E00430086012E003B0059012E005B0071012E006B00A70144000B007D00380154010300010004000200000076014C000000A00162000000AA01660002000500030002000900050002000A00070004800000010000001712E55B000000000000A803000002000000000000000000000001004A000000000002000000000000000000000001008F000000000003000200040003000000003C4D6F64756C653E0053706C6974537472696E674368756E6B65642E646C6C0055736572446566696E656446756E6374696F6E7300737472696E6753706C697400726573756C7473006D73636F726C69620053797374656D004F626A6563740053797374656D2E436F6C6C656374696F6E730049456E756D657261746F7200535542535452494E475F53495A450053797374656D2E446174610053797374656D2E446174612E53716C54797065730053716C43686172730053706C6974537472696E675F53696E676C650053716C537472696E670053716C496E7433320053706C6974537472696E675F53696E676C655F46696C6C002E63746F72006765745F43757272656E74004D6F76654E657874005265736574006C617374506F73006E657874506F73007365676D656E744C656E006E6578745365676D656E7400726F774E756D62657200726573756C7400746865537472696E670064656C696D69746572007365676D656E7400746865526573756C740043757272656E74007373006765745F526F774E756D626572006765745F4F7574707574537472696E6700526F774E756D626572004F7574707574537472696E6700496E707574004D6963726F736F66742E53716C5365727665722E5365727665720053716C46616365744174747269627574650044656C696D69746572006F626A006974656D0053797374656D2E52756E74696D652E496E7465726F705365727669636573004F757441747472696275746500726F7700546865537472696E670053797374656D2E5265666C656374696F6E00417373656D626C795469746C6541747472696275746500417373656D626C794465736372697074696F6E41747472696275746500417373656D626C79436F6E66696775726174696F6E41747472696275746500417373656D626C79436F6D70616E7941747472696275746500417373656D626C7950726F6475637441747472696275746500417373656D626C79436F7079726967687441747472696275746500417373656D626C7954726164656D61726B41747472696275746500417373656D626C7943756C7475726541747472696275746500436F6D56697369626C6541747472696275746500417373656D626C7956657273696F6E4174747269627574650053797374656D2E52756E74696D652E436F6D70696C6572536572766963657300436F6D70696C6174696F6E52656C61786174696F6E734174747269627574650052756E74696D65436F6D7061746962696C6974794174747269627574650053706C6974537472696E674368756E6B65640053716C46756E6374696F6E417474726962757465006765745F49734E756C6C006765745F56616C75650043686172006F705F496D706C6963697400537472696E6700456D707479006765745F4C656E67746800436F6E636174005472696D00526561640000000320000000000046F581EED4391E48810AFC83870C8AA30008B77A5C561934E089020608044C1D00000800021209120D120D0A0003011C1011111011150320000107200201120D1D030320001C0320000202060E0306120D02060303061D03030612100328001C05200101120C0306120C032000080320000E032800080328000E12010001005408074D617853697A65FFFFFFFF12010001005408074D617853697A6501000000042001010E04200101020420010108808001000300540E1146696C6C526F774D6574686F644E616D651753706C6974537472696E675F53696E676C655F46696C6C540E0F5461626C65446566696E6974696F6E204F7574506172616D206E766172636861722834303030292C20526F7720494E54540E044E616D651573706C69745F737472696E675F776974685F726F770420001D03052001011D0305000111110E0500011115080407011210072003011D0308080500020E0E0E0820040A0A1D03080804070202081701001253706C6974537472696E674368756E6B656400000501000000000E0100094D6963726F736F667400002001001B436F7079726967687420C2A9204D6963726F736F6674203230313200000801000800000000001E01000100540216577261704E6F6E457863657074696F6E5468726F77730100A02D00000000000000000000BE2D0000002000000000000000000000000000000000000000000000B02D000000000000000000000000000000005F436F72446C6C4D61696E006D73636F7265652E646C6C0000000000FF2500204000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100100000001800008000000000000000000000000000000100010000003000008000000000000000000000000000000100000000004800000058400000780300000000000000000000780334000000560053005F00560045005200530049004F004E005F0049004E0046004F0000000000BD04EFFE0000010000000100E55B171200000100E55B17123F000000000000000400000002000000000000000000000000000000440000000100560061007200460069006C00650049006E0066006F00000000002400040000005400720061006E0073006C006100740069006F006E00000000000000B004D8020000010053007400720069006E006700460069006C00650049006E0066006F000000B4020000010030003000300030003000340062003000000034000A00010043006F006D00700061006E0079004E0061006D006500000000004D006900630072006F0073006F00660074000000500013000100460069006C0065004400650073006300720069007000740069006F006E0000000000530070006C006900740053007400720069006E0067004300680075006E006B00650064000000000040000F000100460069006C006500560065007200730069006F006E000000000031002E0030002E0034003600330031002E00320033003500320035000000000050001700010049006E007400650072006E0061006C004E0061006D0065000000530070006C006900740053007400720069006E0067004300680075006E006B00650064002E0064006C006C00000000005C001B0001004C006500670061006C0043006F007000790072006900670068007400000043006F0070007900720069006700680074002000A90020004D006900630072006F0073006F006600740020003200300031003200000000005800170001004F0072006900670069006E0061006C00460069006C0065006E0061006D0065000000530070006C006900740053007400720069006E0067004300680075006E006B00650064002E0064006C006C0000000000480013000100500072006F0064007500630074004E0061006D00650000000000530070006C006900740053007400720069006E0067004300680075006E006B00650064000000000044000F000100500072006F006400750063007400560065007200730069006F006E00000031002E0030002E0034003600330031002E00320033003500320035000000000048000F00010041007300730065006D0062006C0079002000560065007200730069006F006E00000031002E0030002E0034003600330031002E003200330035003200350000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002000000C000000D03D00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000WITH PERMISSION_SET = SAFE;[/code][code="sql"]CREATE FUNCTION dbo.SplitStringWithRow(    @Input      nvarchar(max),    @Delimiter  nvarchar(1))RETURNS  TABLE(    Item    nvarchar(4000) NULL,	RowNo   integer NULL)AS EXTERNAL NAME SplitStringWithRow.UserDefinedFunctions.SplitString_Single;[/code]</description><pubDate>Tue, 04 Sep 2012 18:07:36 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>Alright kids, it's time for another round of fun...Last week I started playing with a streaming CLR solution, hoping to come up with something that would be a bit more memory efficient. I was able to do just that, but the bad news is that better memory efficiency -- at least for my current code -- means somewhat worse performance. Not a whole lot worse, mind you, maybe 10-15%, but still, worse.Interestingly, I ran some tests on a machine that has both SQL Server 2008 R2 and SQL Server 2012 installed, and all of my test queries (for either version) run around 50% faster in 2012. Kind of an interesting bonus there, and I have no idea what changed. But no complaints.The code that follows is my crack at a more memory efficient solution. It works only for a single delimiter (I never did see the point of multi-character delimiters, to be honest), and also does a few other things that some people will hate, such as trimming the output and not returning empty results. I also bolted on a row number, which a number of people have requested for the last SQLCLR version that was posted.What's more memory efficient about it? Instead of reading the whole string in at once it uses the .Read method on the SqlChars instance to read 7500 characters at a time (that number is configurable via a constant set at the top). I played around with other numbers but was unable to come up with much in terms of conclusive evidence that the choice makes a difference. Really low numbers -- like 10 -- and really high numbers -- like 75000 -- definitely impact the return times, but aside from that it doesn't seem to matter.I am measuring memory utilization primarily using the handy total_allocated_memory_kb column that's been added to sys.dm_clr_appdomains in SQL Server 2012, and I can say that this code does use less memory overall, but not much less. What I haven't been able to measure yet is how much PEAK memory is being used. That will require perf counters or something else and to be honest I've been too lazy to bother quite yet :-)Anyway, here's the code. Anyone feel like optimizing, testing, or otherwise kicking the tires a bit?[code="other"]using System;using System.Collections;using System.Data;using System.Data.SqlClient;using System.Data.SqlTypes;using Microsoft.SqlServer.Server;public partial class UserDefinedFunctions{    //It's unclear what the penalty for setting this too high or too low is    //for the moment I'm setting it to a point where it will help to avoid LOB heap allocations    const int SUBSTRING_SIZE = 7500;    [Microsoft.SqlServer.Server.SqlFunction(       FillRowMethodName = "SplitString_Single_Fill",       TableDefinition = "OutParam nvarchar(4000), Row INT",       Name = "split_string_with_row")]    public static IEnumerator SplitString_Single(      [SqlFacet(MaxSize = -1)]      SqlChars Input,      [SqlFacet(MaxSize = 1)]      SqlChars Delimiter)    {        return (            (Input.IsNull || Delimiter.IsNull) ?            new stringSplit(new SqlChars(new char[0]), new char[0]) :            new stringSplit(Input, Delimiter.Value));    }    public static void SplitString_Single_Fill(object obj, out SqlString item, out SqlInt32 row)    {        var r = (stringSplit.results)obj;        item = r.OutputString;        row = r.RowNumber;    }    public class stringSplit : IEnumerator    {        public stringSplit(SqlChars TheString, char[] Delimiter)        {            theString = TheString;            delimiter = (Delimiter.Length == 0 ? ' ' : Delimiter[0]);            segment = new char[SUBSTRING_SIZE];            theResult = new results(this);            this.Reset();        }        #region IEnumerator Members        public object Current        {            get            {                return this.theResult;            }        }        public bool MoveNext()        {            result = String.Empty;            bool foundNext = false;            while (true)            {                while (nextPos &amp;lt; segmentLen)                {                    if (segment[nextPos] == delimiter)                    {                        //ignore consecutive delimiters                        if (nextPos &amp;gt; lastPos)                        {                            if (result.Length &amp;gt; 0)                                result = String.Concat(result, new string(segment, lastPos, nextPos - lastPos)).Trim();                            else                                result = new string(segment, lastPos, nextPos - lastPos).Trim();                        }                        if (result.Length &amp;gt; 0)                            foundNext = true;                            lastPos = nextPos + 1;                    }                    nextPos++;                    if (foundNext)                        break;                }                if (foundNext)                    break;                else if (nextPos == segmentLen)                {                    //do not trim here -- we don't know if spaces here are trailing or if there's more content in the next segment                    if (result.Length &amp;gt; 0)                        result = String.Concat(result, new string(segment, lastPos, nextPos - lastPos));                    else                        result = new string(segment, lastPos, nextPos - lastPos);                                        nextPos = 0;                    lastPos = 0;                    //Is there anything left to read?                    if (0 == (segmentLen = (int)theString.Read(nextSegment, segment, 0, SUBSTRING_SIZE)))                    {                        //final trim to remove trailing spaces at the end                        result = result.Trim();                        if (result.Length &amp;gt; 0)                            foundNext = true;                        break;                    }                    else                        nextSegment += segmentLen;                }            }            rowNumber++;            return (foundNext);        }        public void Reset()        {            lastPos = SUBSTRING_SIZE;            nextPos = SUBSTRING_SIZE;            segmentLen = SUBSTRING_SIZE;            nextSegment = 0;            result = String.Empty;        }        #endregion        private int lastPos;        private int nextPos;        private int segmentLen;        private int nextSegment;        private int rowNumber = 0;        private string result;        private readonly SqlChars theString;        private readonly char delimiter;        private readonly char[] segment;        private readonly results theResult;        public class results        {            public results(stringSplit ss)            {                this.ss = ss;            }            private readonly stringSplit ss;            public int RowNumber            {                get                {                    return (ss.rowNumber);                }            }            public string OutputString            {                get                {                    return (ss.result);                }            }        }    }};[/code]</description><pubDate>Tue, 04 Sep 2012 16:04:09 GMT</pubDate><dc:creator>Adam Machanic</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>[quote][b]John Nelson-327605 (10/18/2011)[/b][hr]So...I'm liking the DelimitedSplit8K, but I need to increase it to an unknown number between 25K and 150K rows.  Any way to modify this so I can go that high without running into a performance wall?[/quote]My apologies... this got lost in all of the emails I get.The DelimitedSplit8k function can be easily used with a CROSS APPLY as demonstrated in the "Tally OH" article previously cited to handle virtually any number of rows that need up to VARCHAR(8000) columns to be split.  Of course, the CLR will outperform the DelimitedSplit8k function by about 2:1.</description><pubDate>Thu, 24 Nov 2011 08:22:33 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>So...I'm liking the DelimitedSplit8K, but I need to increase it to an unknown number between 25K and 150K rows.  Any way to modify this so I can go that high without running into a performance wall?</description><pubDate>Tue, 18 Oct 2011 22:21:37 GMT</pubDate><dc:creator>allnelsons</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>Now THAT's funny!  :-)</description><pubDate>Sun, 25 Sep 2011 08:47:37 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>:-DThanks Jeff.  I've been following this thread for years.  It's where I keep my egg recipe.</description><pubDate>Sun, 25 Sep 2011 06:45:58 GMT</pubDate><dc:creator>Bob Lee</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>Just in case anyone is still following this thread and may have missed the article... the performance problem wasn't with the Tally Table itself.  It was with the concatenation of delimiters.  Please see the article at the following link...[url]http://www.sqlservercentral.com/articles/Tally+Table/72993/[/url]</description><pubDate>Sat, 24 Sep 2011 20:52:16 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>That was an eggseptionally good pun, Jeff.</description><pubDate>Wed, 09 Sep 2009 13:21:58 GMT</pubDate><dc:creator>The Dixie Flatline</dc:creator></item><item><title>RE: Performance issue with tally solution</title><link>http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx</link><description>Eggsactly...</description><pubDate>Wed, 09 Sep 2009 13:06:10 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item></channel></rss>