SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Remove non printable characters


Remove non printable characters

Author
Message
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)

Group: General Forum Members
Points: 212531 Visits: 41977


Thanks for the link and welcome aboard! Shifting gears, please be sure to test that code from that link for performance before you think of using it because it's likely a whole lot slower than the code already posted on this thread because of the scalar function with a While loop in it. The "nested REPLACE" code posted on this thread will absolutely fly.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
telastruct
telastruct
SSC Rookie
SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)

Group: General Forum Members
Points: 27 Visits: 13
Actually, I did test it and it works great. I tested it on a table with almost 4 million rows and it only took a minute or two.
Paul White
Paul White
SSC-Dedicated
SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)

Group: General Forum Members
Points: 35116 Visits: 11359
telastruct (12/30/2010)
Actually, I did test it and it works great. I tested it on a table with almost 4 million rows and it only took a minute or two.

Now try the 'nested REPLACE' code Jeff referred to.
You should find it's much faster (and more efficient).
Actually.



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)

Group: General Forum Members
Points: 212531 Visits: 41977
telastruct (12/30/2010)
Actually, I did test it and it works great. I tested it on a table with almost 4 million rows and it only took a minute or two.


As Paul said, test the nested-REPLACE method... you'll find that 4 million rows are process in just several seconds... ;-) That's provided that you're not outputing the 4 million rows to the display which is the "great equalizer". Lot's of people think their code runs well when they compared it to other code only by how long it takes to render on the screen. Bad code and good code will frequently take the same time on-screen simply because they have the same number of rows to display.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
telastruct
telastruct
SSC Rookie
SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)SSC Rookie (27 reputation)

Group: General Forum Members
Points: 27 Visits: 13
Thanks, guys. I apologize if my "actually" seemed snitty- I see how it could be taken that way after re-reading it. I guess I should have followed it up with the question: If the replace method "can clean the 13 million rows in about 25 minutes" (about 2 minutes for each million rows), how is that faster than 2 minutes for about 4 million rows? I'm not a DB expert, and am sure there's probably some other factor I'm missing...?

Even if it is not as efficient, I do like the way the other solution is packaged- it seems like the replace method would be pretty easy to incorporate.

Thank you again for your thoughts.
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)

Group: General Forum Members
Points: 212531 Visits: 41977
telastruct (12/30/2010)
Thanks, guys. I apologize if my "actually" seemed snitty- I see how it could be taken that way after re-reading it. I guess I should have followed it up with the question: If the replace method "can clean the 13 million rows in about 25 minutes" (about 2 minutes for each million rows), how is that faster than 2 minutes for about 4 million rows? I'm not a DB expert, and am sure there's probably some other factor I'm missing...?

Even if it is not as efficient, I do like the way the other solution is packaged- it seems like the replace method would be pretty easy to incorporate.

Thank you again for your thoughts.


Understood and thanks for the feedback.

The real key is the two different rowcounts on 2 different tables on 2 different machines by two different people using two different methods that haven't been fully posted. For example, are they doing an UPDATE, piping to a new table, or displaying on the screen? What's the status of fragmentation for both tables? How many indexes are being updated? Are the systems using the same version? Are the systems using the same capacity pipe and hard drives? What is the wait-state of the two systems? Are they both "dead quiet" or is one supporting a world wide heavy hit Web site with lots of reporting going on?

There's something else that folks need to understand... each system has a "tipping point" where UPDATEs are concerned. For example... on one system it may take only 10 seconds to update a million rows and, likewise, only 20 seconds to update 2 million rows and only 30 seconds to update 3 million rows. But, on that same system with the same table and index structures using the exact same query, it may suddenly take 4 hours to update only 4 million rows. Everything else being the same, another system may be able to handle 6 million rows without reaching the "tipping point".

If you want to do a comparison, it has to be on the same table and the same data or at least data with the same level of randomization. You just can't tell what's what with all the other variables currenlty in play.

I'll also admit that I've seen certain memory only scalar functions beat other methods even when they have a WHILE loop in it. But, we won't actually know until at least one person actually tests both methods on the same data on the same machine. I was hoping you'd be that person so I didn't actually have to be "the one" yet again. :-P

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
mkucera
mkucera
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 7
This is exactly what i was looking for. My data had ASCII(160) appended to it. I just added that to the list and it worked very well.

-MK
ryan.blosser
ryan.blosser
Forum Newbie
Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)

Group: General Forum Members
Points: 8 Visits: 28
This is what I have been looking for and forgive me if this is a simple question (I am trying to learn some of these concepts, like in-line table-valued functions), but how do you easily apply this to an entire table? How do you use this, even? It returns a table (not permanent), so I selected the result set into a new table, but I don't know if that defeats the point of using the TVF in the first place.

This:

select * into dbo.testtable from
(SELECT t1.id as id, t1.title as title, clean.cleaned as cleaned from t1
cross apply dbo.if_cleanwithreplace(t1.title) clean) a

fixes one column
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)

Group: General Forum Members
Points: 212531 Visits: 41977
ryan.blosser (5/18/2013)
This is what I have been looking for and forgive me if this is a simple question (I am trying to learn some of these concepts, like in-line table-valued functions), but how do you easily apply this to an entire table? How do you use this, even? It returns a table (not permanent), so I selected the result set into a new table, but I don't know if that defeats the point of using the TVF in the first place.

This:

select * into dbo.testtable from
(SELECT t1.id as id, t1.title as title, clean.cleaned as cleaned from t1
cross apply dbo.if_cleanwithreplace(t1.title) clean) a

fixes one column


You can simplify that query quite a bit. No need for the nesting.
SELECT t1.ID, t1.Title, clean.Cleaned
INTO dbo.TestTable
FROM t1
CROSS APPLY dbo.if_cleanwithreplace(t1.Title) clean
;


The purpose of the iTVF, in this case, is to function more like an iSF (Inline Scalar Function) which is typically about 7 times faster than a run-of-the-mill scalar function. See the following article for more on that.
http://www.sqlservercentral.com/articles/T-SQL/91724/

Putting the results in a table doesn't negate the value of such functions. It depends on what you want to do. The results could just as easily be returned directly.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Tom_Hogan
Tom_Hogan
Ten Centuries
Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)

Group: General Forum Members
Points: 1009 Visits: 2588
Interesting solution.

How would you handle multiple columns to clean? Doing multiple cross applies seems like it would kill performance as you could be theoretically joining back to the same 4 million record table 2, 3, or more times.

Thanks.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search