|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Sunday, November 04, 2012 12:23 PM
Points: 2,087,
Visits: 3,932
|
|
|
|
|
|
Hall of Fame
       
Group: General Forum Members
Last Login: Today @ 12:32 AM
Points: 3,574,
Visits: 5,112
|
|
jcrawf02 (11/18/2009)
Thought I'd tell you, you guys are just fun to watch. 
And provide significant community benefit too!!
Best,
Kevin G. Boles SQL Server Consultant SQL MVP 2007-2012 TheSQLGuru at GMail
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 6:23 AM
Points: 32,903,
Visits: 26,784
|
|
C# Screw (11/18/2009) Jeff, as promised : all tests combined SQL & CLR
Beginning execution loop Batch execution completed 10000 times. SQL:Looping 2536 SQL: using ||*9*9|| 6310 SQL: Jeffs single char 2916 CLR: looping 453 CLR: Not looping using ||*9*9|| technique in C# 296 CLR: Not looping using Jeff's single char technique in C# 416 <-- consistently c# performs slower single space bell character, surprising? Correction : Noticed one space missing in c# CLR: Not looping using Jeff's single char technique in C# 246
I am really sorry the c# input has caused some upset, I thought it was really useful/interesting.. so 10/10 article & productive discussion.
Oh no... don't be sorry at all about that. I just wanted to be sure what we were testing against. Thank you very much for taking that time. And thanks for the 10-by marks.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
Hall of Fame
       
Group: General Forum Members
Last Login: Today @ 12:32 AM
Points: 3,574,
Visits: 5,112
|
|
GSquared (11/18/2009)
Nadrek (11/18/2009)
Jeff Moden (11/17/2009) You and another poster said about the same thing... I'm going to have to give it a try. Although it's not likely to have 8000 spaces (almost silly to even think that), I'll have to expand your good code example to 13 replaces so they're functionally identical so far as VARCHAR(8000) goes and see how that performs, as well. Not likely, perhaps... but very common to have about that many if someone started out with a CHAR(8000), even worse if they did concatenation on a CHAR(8000)+ another VARCHAR or CHAR + LEFT() or RIGHT() to get back to 8000. Some forms of data import also lead to excessive space padding. Are they good SQL? No. Do they happen? I believe they do. Also, yes, GSquared and I do apparently use some similar methods. I actually discovered this method almost 20 years ago, when I was building WordPerfect macros to clean up documents that had been typed up in an ASCII text editor. Used it to clean up spaces, underlines, even page-lines created by entering arbitrary numbers of hyphens. Came in very, very handy.
I'll be damned - someone else who did macro-based document management using WordPerfect macros! How kewl is that!? I had 2.5M of macros that did a full-fledged work package management system that included change bar and version tracking, etc. Tied into a dbase IV database system too!
Best,
Kevin G. Boles SQL Server Consultant SQL MVP 2007-2012 TheSQLGuru at GMail
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 6:23 AM
Points: 32,903,
Visits: 26,784
|
|
Michael Meierruth (11/18/2009)
Earlier on in this post I mentioned the use of a binary approach. I finally found some time to try this out. declare @s varchar(8000) set @s = '*' + replicate(' ',7998) + '*' select len(@s) set @s = replace(replace(replace(replace(replace(replace(ltrim(rtrim(@s)), replicate(' ',32+1),' '),replicate(' ',16+1),' '),replicate(' ',8+1),' '),replicate(' ',4+1),' '),replicate(' ',2+1),' '),replicate(' ',1+1),' ') select len(@s) select @s
Needless to say I have tried this with Jeff's data as well as with some random data and it works just fine. As for performance, I generated 100000 records and Jeff's code did in 110 seconds whereas my code did in 16 seconds. I shaved off a second by replacing the calls to replicate (which I used for clarity) with hardcoded constants or variables. Oh yes, Jeff's code ran at 100% CPU whereas my code never reached 100%. There is a recursive effect going on which I'm still trying to understand from a methematical point of view, i.e. for a given N+1 spaces what is the maximum length of blanks that can be reduce to 1 blank using the above approach. Of course, if someone can find a string with blanks where this doesn't work I will quietly shut up.
Whoa! Very cool... I'm going to have to try that one for sure. Thanks, Michael!
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Yesterday @ 12:09 PM
Points: 2,550,
Visits: 18,882
|
|
TheSQLGuru (11/18/2009)
GSquared (11/18/2009)
Nadrek (11/18/2009)
Jeff Moden (11/17/2009) You and another poster said about the same thing... I'm going to have to give it a try. Although it's not likely to have 8000 spaces (almost silly to even think that), I'll have to expand your good code example to 13 replaces so they're functionally identical so far as VARCHAR(8000) goes and see how that performs, as well. Not likely, perhaps... but very common to have about that many if someone started out with a CHAR(8000), even worse if they did concatenation on a CHAR(8000)+ another VARCHAR or CHAR + LEFT() or RIGHT() to get back to 8000. Some forms of data import also lead to excessive space padding. Are they good SQL? No. Do they happen? I believe they do. Also, yes, GSquared and I do apparently use some similar methods. I actually discovered this method almost 20 years ago, when I was building WordPerfect macros to clean up documents that had been typed up in an ASCII text editor. Used it to clean up spaces, underlines, even page-lines created by entering arbitrary numbers of hyphens. Came in very, very handy. I'll be damned - someone else who did macro-based document management using WordPerfect macros! How kewl is that!? I had 2.5M of macros that did a full-fledged work package management system that included change bar and version tracking, etc. Tied into a dbase IV database system too!  Pipe down, us younguns can't hear the discussion over the creaking of your old bones...
--------------------------------------------------------- How best to post your question How to post performance problems Tally Table:What it is and how it replaces a loop
"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Yesterday @ 12:09 PM
Points: 2,550,
Visits: 18,882
|
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 6:23 AM
Points: 32,903,
Visits: 26,784
|
|
vliet (11/18/2009) Hello Jeff,
Thank you so much for posting this solution! We will apply your method in a SSIS Derived Column Transformation on data coming from raw files. This data needs to be cleansed before it even reaches the SQL Server staging tables, so we can not use CLR functions or SQL functions.
I will let you know how it performs, I'll try to use the horizontal tab (ASCII code 9) as the special character because this character will be replaced by a single space in this same transformation step. I'm curious about the performance in this context, but I must admit that your solution is much more elegant than my initial version with the 65, 33, 17, 9, 5, 3 an 2 spaces nested REPLACE construction. Your method can be applied on very wide columns without a limit to the number of contiguous spaces, and all it needs is one special character.
We can use the same expression for every string column that needs to be cleansed and we only need to replace the column name once. This is an aspect of your method that comes in handy in real life situations where you need to cleans more than a few columns of a few tables.
Keep up the good work!
Dony.
Hi Dony... thanks for the great feedback. BUT... as so often happens (it's almost designed to happen because so many good people get involved), the method I posted has been soundly beaten by, you guessed it, a nested replace very similar to what you said. Please see the following post on this thread....
http://www.sqlservercentral.com/Forums/Topic819042-203-3.aspx#BM820760
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 6:23 AM
Points: 32,903,
Visits: 26,784
|
|
Paul White (11/18/2009)
Hey everyone, just a quick note before I go to work this morning. I love the way this thread it going - huge thanks especially to C# Screw (lol!) and Flo for tidying my (still slightly amateur) C# as usual! I have a fairly busy morning ahead but will look in this afternoon - no doubt by then someone will have found an improvement which runs in negative time  I hope Jeff pops by soon because I'm really interested in the huge performance difference (10x) I found working with the different collations with his cool method. Paul
I'm at work right now so I can only provide quick "verbal" answers... can't take the time to test any code at work. I'll give the collation code a test tonight... very interesting find you made, Paul. Heh... more proof that SQL is NOT SQL.... well, unless it's Tuesday, it's raining, and you have precisely 3 pork chops on the thaw.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Wednesday, May 08, 2013 3:35 AM
Points: 224,
Visits: 629
|
|
I am wondering if most usefull/usable Community result should include strip multiple Tab characters as well?
Thanks Screw
10k rows Each execution completed 10 times.
SQL function: Replace Technique : but using replacement chars ||*9*9|| 2635 SQL function :Brigzy (C#Screw) Looping 2517 SQL function: Jeff Original : single bell char 1884 CLR: Brigzy (C#Screw) Looping 332 SQL function: Michael Meierruth Original 332 CLR: Not looping using ||*9*9|| technique in C# 312 SQL function: Michael Meierruth : with hard coded spaces 304 CLR: using Jeff's single char technique in C# 243 CLR: C#Screw II - Spaces and TABS 181 CLR :using JCBnew space replacer C# 160 CLR: Michael Meierruth - C#Screw conversion: hard coded spaces 143 CLR: using Flo s space replacer C# 121 CLR: using Paul White space replacer C# 105
[SqlFunction()] public static string fn_TidySpace_SharpScrewII_CLR(string s) { StringBuilder sb = new StringBuilder();
string[] parts = s.Split(new char[] { ' ','\t' }, StringSplitOptions.RemoveEmptyEntries);
int size = parts.Length;
for (int i = 0; i < size; i++)
sb.AppendFormat("{0} ", parts[i]);
return sb.ToString();
}
C# Gnu ____________________________________________________
Excel Database Tasks V2 .. Multi user data edit Task based solution - Excel with SQL Server / Access / MySQL ..
Version 2.5 released - See Blog
|
|
|
|