August 21, 2015 at 3:10 pm
Greetings,
I'm presented with a problem where I have a database table which must be migrated via a "custom tool", moving the data into a new table which has special character requirements that didn't exist in the source database. My data resides in an SQL Server 2008R2 instance.
I envision a one-time query which will loop through selected records and replace the offending characters with --, however I'm having trouble understanding how this works.
There are roughly 2500 records which meet the criteria of "contains bad characters", frequently containing multiple separate bad chars, and the table contains roughly 100000 rows.
Special Characters are defined as #%&*:<>?\/{}|~ and ..
While the field is called "Filename" it isn't always so, it is a parent/child table where foldernames are also stored.
Example data:
Tablename = Items
ItemID Filename ListID
1 Badfile<2015>.docx 15
2 Goodfile.docx 15
3 MoreBad#.docx 15
4 Dog&Cat#17.pdf 15
5 John's "Special" Folder 16
The examples I'm finding are all oriented around SELECT statements, to change the output of what I see returned, however I'd rather just fix the entire column using an UPDATE. Initial testing using REPLACE fails because I don't always have a single character as the bad thing in a string.
In a better solution, I found an example using a User Defined Function to modify the output of a select, but I cannot use that UDF in an UPDATE.
My alternative is to learn enough C# to modify the "migration tool" to do this in-transit, but I know even less about C# than I do of SQL.
I gather I want to use @@ROWCOUNT to loop through the rows but I really can't put it all together in a cohesive way.
August 21, 2015 at 3:21 pm
winndm (8/21/2015)
Greetings,I'm presented with a problem where I have a database table which must be migrated via a "custom tool", moving the data into a new table which has special character requirements that didn't exist in the source database. My data resides in an SQL Server 2008R2 instance.
I envision a one-time query which will loop through selected records and replace the offending characters with --, however I'm having trouble understanding how this works.
There are roughly 2500 records which meet the criteria of "contains bad characters", frequently containing multiple separate bad chars, and the table contains roughly 100000 rows.
Special Characters are defined as
#%&*:<>?\/{}|~ and ..While the field is called "Filename" it isn't always so, it is a parent/child table where foldernames are also stored.
Example data:
Tablename = Items
ItemID Filename ListID
1 Badfile<2015>.docx 15
2 Goodfile.docx 15
3 MoreBad#.docx 15
4 Dog&Cat#17.pdf 15
5 John's "Special" Folder 16
The examples I'm finding are all oriented around SELECT statements, to change the output of what I see returned, however I'd rather just fix the entire column using an UPDATE. Initial testing using REPLACE fails because I don't always have a single character as the bad thing in a string.
In a better solution, I found an example using a User Defined Function to modify the output of a select, but I cannot use that UDF in an UPDATE.
My alternative is to learn enough C# to modify the "migration tool" to do this in-transit, but I know even less about C# than I do of SQL.
I gather I want to use @@ROWCOUNT to loop through the rows but I really can't put it all together in a cohesive way.
Forget everything you know about looping when it comes to sql. The last thing you want to do is loop through every character in every row for 10,000 rows. If you wrote that code you might as well go home for dinner and come back later because it will take an eternity to run. You should also forget about user defined functions because most likely it was a scalar function which is also very horrible for performance.
You instead want to do this using a set based method. It is much faster and it is also a lot simpler to code and fix. Assuming you just want to remove all the offending characters in your list this should take care of the entire table for all characters in one simple update statement.
update Items
set [Filename] =
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE([Filename], '#', '')
, '%', '')
, '&', '')
, '*', '')
, ':', '')
, '&', '')
, '<', '')
, '>', '')
, '?', '')
, '\', '')
, '/', '')
, '{', '')
, '}', '')
, '|', '')
, '~', '')
, '..', '')
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
August 21, 2015 at 3:54 pm
Thank you, this works for the test set I've built!
I see I missed a few cases in my test data and will test it a little harder.
Any chance a doublequote being one of the characters to replace will be cause a fail? I'm hazy on how SQL handles escaping (I thought I'd read that both single and double quotes can be used to escape control characters).
I think my issue with the nested REPLACE examples I'd seen was they were terribly hard to read and my brain just didn't click.
Your example was on-point, thank you again!
August 24, 2015 at 7:09 am
winndm (8/21/2015)
Thank you, this works for the test set I've built!I see I missed a few cases in my test data and will test it a little harder.
Any chance a doublequote being one of the characters to replace will be cause a fail? I'm hazy on how SQL handles escaping (I thought I'd read that both single and double quotes can be used to escape control characters).
I think my issue with the nested REPLACE examples I'd seen was they were terribly hard to read and my brain just didn't click.
Your example was on-point, thank you again!
You shouldn't have any issue with a double quote. Just add another level of nesting, no need to escape it. It is difficult to format that level of nesting and keep it legible. It is however super fast. Glad my example worked for you.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
Viewing 4 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply