Alan.B (6/25/2014)
Sean Lange (6/25/2014)
Alan.B (6/25/2014)
GentsFirst: Dang it Sean - you beet me to it! I had a very similar solution but you posted yours first. That said, I did do a performance test and ran into something I cannot figure out.
Nice. I didn't see your test data and haven't had a chance to roll my own yet. I am guessing that the big difference is that my code uses STUFF and yours does not.
If you still have your test harness handy try with this.
ALTER FUNCTION [dbo].[StripNonNumeric_itvf](@OriginalText VARCHAR(8000))
RETURNS TABLE WITH SCHEMABINDING AS
return
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
Tally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
, clean_text(CleanedText) as
(
SELECT SUBSTRING(@OriginalText, t.N, 1)
FROM tally t
WHERE ASCII(SUBSTRING(@OriginalText, t.N, 1)) BETWEEN 48 AND 57
FOR XML PATH('')
)
select CleanedText
from clean_text
What you posted was getting the same results (31 seconds)... I added this to the where clause:
AND n <= len(@OriginalText)
Final Version
ALTER FUNCTION [dbo].[StripNonNumeric_itvf_sean2](@OriginalText VARCHAR(8000))
RETURNS TABLE WITH SCHEMABINDING AS
return
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
Tally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
, clean_text(CleanedText) as
(
SELECT SUBSTRING(@OriginalText, t.N, 1)
FROM tally t
WHERE ASCII(SUBSTRING(@OriginalText, t.N, 1)) BETWEEN 48 AND 57
AND n <= len(@OriginalText) -- added by ajb
FOR XML PATH('')
)
select CleanedText
from clean_text
GO
It now runs in Milliseconds, same speed as mine. ~30 seconds for a million rows on my system using the test harness I posted earlier.
Edit: Some of my code got turned into HTML tags.
That makes sense. No point evaluating all the 10,000 rows when they won't meet the condition anyway. I suspect if we changed my original (with the STUFF) it would perform much the same with the inclusion of the additional predicate in the where clause. Awesome work Alan.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/