SSC Rookie
      
Group: General Forum Members
Last Login: Monday, February 18, 2013 2:25 PM
Points: 32,
Visits: 87
Magoo,
I think you're right about how there are embedded NULLS between each character. I ran the series of convert functions you provided and it showed AHSGuest††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††.
I really appreciate you providing this. I might use a modified version of this in a stored procedure as I create a programmatic method of cleaning all the columns of all our tables.
I'll update this thread with that stored procedure.
mtf
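NULs "between each character" are the pattern you would expect when UTF-16 text (the encoding SQL Server uses for nvarchar) is read one byte at a time as single-byte characters. The sketch below is in Python rather than T-SQL, and the sample value is simply the name quoted in this post. It only illustrates the byte layout; it is an assumption about the cause, since the thread does not show the actual import process.

```python
# Illustration only: how UTF-16LE text looks when treated as single-byte text.
sample = "AHSGuest"  # sample value taken from the post above

utf16_bytes = sample.encode("utf-16-le")  # two bytes per character for ASCII-range text
misread = utf16_bytes.decode("latin-1")   # read every byte as its own character

print(repr(misread))               # each real character is followed by a NUL (\x00)
print(len(sample), len(misread))   # the misread value is twice as long

# Removing the NULs recovers the text (only safe for ASCII-range data).
cleaned = misread.replace("\x00", "")
```

The doubling in length is also why a value can be truncated if the column is not at least twice as wide as the original text.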
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, February 18, 2013 2:25 PM
Points: 32,
Visits: 87
Lowell,
That was just the ticket!
The select StripNonAlphaNumeric(name) produces exactly the ASCII without the UNICODE NULLs embedded. The update also seems to work in the testing I've done so far!!!
For applying it to the entire table, I wanted to limit it to just the affected rows, so I use the earlier nested selects to narrow it down:
update my_user_table
set name=dbo.StripNonAlphaNumeric(name)
WHERE ID in (
    select DISTINCT B.id
    FROM (
        select id, name, N as Position,
               SUBSTRING(name,N,1) As TheChar,
               ASCII(SUBSTRING(name,N,1)) TheAsciiCode
        from my_user_table
        CROSS APPLY (
            SELECT TOP 255 ROW_NUMBER() OVER (ORDER BY (SELECT NULL) -1) AS n
            FROM sys.columns
        ) MiniTally
        WHERE MiniTally.n BETWEEN 0 AND 255
          AND MiniTally.n < LEN(name)+1
    ) B
    WHERE B.TheAsciiCode=0
)
We have these UNICODE NULLS infecting many of the columns in many of our tables. I'm going to write a stored procedure that utilizes this function to programmatically clean every column in every table. I'll update this thread with my solution soon.
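The idea in the UPDATE above (find the ids of rows that contain a NUL, then clean only those rows) can be sketched outside the database as follows. This is a Python illustration, not the T-SQL used in the thread: the `rows` dictionary is a hypothetical stand-in for my_user_table, and `strip_nuls` removes only NUL characters, whereas the real dbo.StripNonAlphaNumeric function presumably removes more than that.

```python
# Hypothetical in-memory stand-in for my_user_table (id -> name).
rows = {1: "A\x00H\x00S\x00", 2: "Lowell", 3: "m\x00t\x00f\x00"}

def has_nul(value: str) -> bool:
    """True when the value contains at least one NUL (character code 0)."""
    return "\x00" in value

def strip_nuls(value: str) -> str:
    """Remove NUL characters only; every other character is kept."""
    return value.replace("\x00", "")

# Step 1: narrow down to the affected ids (the role of the nested SELECT).
affected_ids = sorted(i for i, name in rows.items() if has_nul(name))

# Step 2: update only those rows (the role of the outer UPDATE).
for i in affected_ids:
    rows[i] = strip_nuls(rows[i])

print(affected_ids)
print(rows)
```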
For now, though, I wanted to let you know how much I appreciate the people who have helped me in this thread. I am HUGELY appreciative. Thank you so much.
mtf
SSChampion
        
Group: General Forum Members
Last Login: Today @ 3:33 PM
Points: 11,648,
Visits: 27,768
mrTexasFreedom, I'm glad my code helped a bit, but Mister Magoo really identified the culprit, I think:
some process imported (and maybe still imports) data that should be nvarchar instead of varchar.
You should try to track down whatever that process is and fix it at the source; otherwise this cleanup-after-the-mess step is going to be needed every time that other process runs.
Lowell
--There is no spoon, and there's no default ORDER BY in sql server either. Actually, Common Sense is so rare, it should be considered a Superpower. --my son
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, February 18, 2013 2:25 PM
Points: 32,
Visits: 87
Thank you for the endorsement on Magoo's convert trick. I am going to adjust it so the final character isn't repeated and if I can get that to work, I'll use it in my stored proc.
The source of those documents has been identified and we're putting processes in place to prevent more corrupt data from being imported. Our developers are also working on a front-end filter to dump anything that isn't standard UTF8. I'm the person tasked with cleaning up the mess that's already been created.
Have a great weekend!
mtf
Ten Centuries
      
Group: General Forum Members
Last Login: Today @ 5:07 PM
Points: 1,308,
Visits: 3,900
You should be able to just wrap the existing method in LEFT(....,LEN(name)) to get the correct answer...
but only if the original data was no more than half as long as your column can hold, otherwise you have lost some data...
MM
SSC Journeyman
      
Group: General Forum Members
Last Login: Tuesday, May 21, 2013 9:12 AM
Points: 95,
Visits: 223
mrTexasFreedom (1/17/2013)
Lowell,
Sadly, this isn't matching any rows:
select id, name FROM my_user_table WHERE CHARINDEX(name,CHAR(0),1) > 0
No rows returned. Likewise, the replace operation doesn't touch any of the rows. Ideas?
Thank you very much for the help you've provided thus far. I feel like I'm on the verge of clobbering this beast thanks to your assistance!
mtf
It's not returning any rows because the arguments to CHARINDEX are swapped: the syntax takes the expression to find first, followed by the expression to search.
If you use CHARINDEX(CHAR(0), name, 1), I believe you will catch the hidden bad characters. I know this thread is pretty old, but I just loved the way you all made it such a quick learning session. Thanks to Lowell for sharing this thread with me.
--Pra
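To see why the swapped arguments match nothing, here is a small Python sketch. The `charindex` helper is hypothetical: it only imitates the documented argument order and return convention of CHARINDEX (1-based position, 0 when not found) and does not model SQL Server collation behavior.

```python
def charindex(expression_to_find: str, expression_to_search: str, start: int = 1) -> int:
    """Hypothetical imitation of CHARINDEX: 1-based position, 0 when not found."""
    begin = max(start, 1) - 1  # convert the 1-based start to a 0-based index
    return expression_to_search.find(expression_to_find, begin) + 1

name = "A\x00H\x00S\x00"  # sample value with embedded NULs

# Swapped order: searches for the whole name inside a single NUL character.
swapped = charindex(name, "\x00", 1)

# Correct order: searches for a NUL inside the name.
correct = charindex("\x00", name, 1)

print(swapped, correct)
```

With the swapped order the six-character name can never be found inside a one-character string, so the result is always 0 and the WHERE clause filters out every row.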
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, February 18, 2013 2:25 PM
Points: 32,
Visits: 87
Thanks for spotting that. I'll try that out in my test DB and see if it is a more efficient method.
Appreciatively,
mtf