Nice job, Alan.
One question: what's the CHECKSUM() for? It doesn't appear to change the result from ROW_NUMBER().
PatExclude8K and PatReplace8K evolved from the thread at the end of this comment where Lowell was looking for tips to develop an itvf_strip_nonnumeric function. (Check it out if you have not seen the thread - it's a great stuff, I learned a ton).
The original code for the tally table part looked like this:
SELECT TOP (LEN(@String)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E1 a....)
Eirikur changed the tally table part to deal with the implicit conversation in the TOP clause like this:
SELECT TOP (CONVERT(BIGINT,LEN(@String),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E1 a....)
Jeff Moden further enhanced it by removing the conversion to BIGINT in the TOP clause and doing the CHECKSUM over ROW_NUMBER(). He explains the change in the comment section of his updated function:
2. CHECKSUM returns an INT and will return the exact number given if given an INT to begin with. It's also faster than a CAST or CONVERT and is used as a performance enhancer by changing the BIGINT of ROW_NUMBER() to a more appropriately sized INT.
Based on my testing, the function appears to perform a bit faster using CHECKSUM over ROW_NUMBER() instead of the conversion to BIGINT in the TOP clause.
-- Alan Burstein
Helpful links:Best practices for getting help on SQLServerCentral -- Jeff ModenHow to Post Performance Problems -- Gail ShawNasty fast set-based string manipulation functions:For splitting strings try DelimitedSplit8K or DelimitedSplit8K_LEAD (SQL Server 2012+)To split strings based on patterns try PatternSplitCMNeed to clean or transform a string? try NGrams, PatExclude8K, PatReplace8K, DigitsOnlyEE, or Translate8KI cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code. -- Itzik Ben-Gan 2001