How to match a UNICODE NULL character within an nVarchar(128) string?


mrTexasFreedom
Magoo,

I think you're right about how there are embedded NULLS between each character. I ran the series of convert functions you provided and it showed AHSGuest††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††.
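
A minimal sketch of why a NUL can end up between each character, assuming the clean value was N'AHSGuest' and that its UTF-16 bytes were at some point read as single-byte text. The variable names are hypothetical, and these are not necessarily the conversions Magoo posted:

```sql
-- Assumed clean value; the thread only shows the repaired output 'AHSGuest'.
DECLARE @clean nvarchar(64) = N'AHSGuest';

-- UTF-16LE stores each of these characters as two bytes, the second being 0x00.
DECLARE @bytes varbinary(128) = CONVERT(varbinary(128), @clean);

-- Reading those bytes as single-byte text leaves a NUL after every letter.
DECLARE @corrupt varchar(128) = CONVERT(varchar(128), @bytes);

-- Reversing the steps pairs the bytes up again into the original characters.
DECLARE @repaired nvarchar(64) =
    CONVERT(nvarchar(64), CONVERT(varbinary(128), @corrupt));

SELECT @bytes                       AS Utf16Bytes,
       DATALENGTH(@corrupt)         AS CorruptByteCount,
       @repaired                    AS Repaired,
       CONVERT(nvarchar(1), 0x2020) AS TwoSpacesAsOneChar;
```

The last column hints at where the trailing † characters may come from: two space bytes (0x20 0x20) read as one UTF-16 code unit form U+2020, the dagger. That would fit a space-padded intermediate value, although the thread does not show the exact conversions used.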

I really appreciate you providing this. I might use a modified version of this in a stored procedure as I create a programmatic method of cleaning all the columns of all our tables.

I'll update this thread with that stored procedure.

mtf

mrTexasFreedom
Lowell,

That was just the ticket!

Selecting dbo.StripNonAlphaNumeric(name) returns exactly the ASCII text, without the embedded UNICODE NULLs. The update also seems to work in the testing I've done so far!

To apply it to the entire table while touching only the affected rows, I reused the earlier nested selects to narrow it down:

-- Clean only the rows that contain at least one embedded NUL (code 0).
UPDATE my_user_table
SET name = dbo.StripNonAlphaNumeric(name)
WHERE id IN
(
    SELECT DISTINCT B.id
    FROM
    (
        -- One row per character position of each name.
        SELECT id, name, MiniTally.n AS Position,
               SUBSTRING(name, MiniTally.n, 1) AS TheChar,
               ASCII(SUBSTRING(name, MiniTally.n, 1)) AS TheAsciiCode
        FROM my_user_table
        -- Inline tally of 1..255, more than enough for an nvarchar(128) column.
        CROSS APPLY (SELECT TOP 255 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
                     FROM sys.columns) MiniTally
        WHERE MiniTally.n <= LEN(name)
    ) B
    WHERE B.TheAsciiCode = 0
)



We have these UNICODE NULLS infecting many of the columns in many of our tables. I'm going to write a stored procedure that utilizes this function to programmatically clean every column in every table. I'll update this thread with my solution soon.
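
As a starting point for that kind of procedure, here is a hedged sketch that only generates candidate UPDATE statements for review and does not run them. It assumes dbo.StripNonAlphaNumeric (named in this thread, but its signature is not shown) accepts each column's type, and it leaves out the affected-rows filter, so each generated statement would touch every row:

```sql
-- Build one UPDATE per nvarchar/nchar column of every base table.
-- The output is text to inspect; nothing is executed here.
SELECT N'UPDATE ' + QUOTENAME(c.TABLE_SCHEMA) + N'.' + QUOTENAME(c.TABLE_NAME)
     + N' SET ' + QUOTENAME(c.COLUMN_NAME)
     + N' = dbo.StripNonAlphaNumeric(' + QUOTENAME(c.COLUMN_NAME) + N');'
       AS CleanupStatement
FROM INFORMATION_SCHEMA.COLUMNS AS c
JOIN INFORMATION_SCHEMA.TABLES AS t
  ON  t.TABLE_SCHEMA = c.TABLE_SCHEMA
  AND t.TABLE_NAME   = c.TABLE_NAME
WHERE t.TABLE_TYPE = 'BASE TABLE'              -- skip views
  AND c.DATA_TYPE IN ('nvarchar', 'nchar')     -- Unicode text columns only
ORDER BY c.TABLE_SCHEMA, c.TABLE_NAME, c.ORDINAL_POSITION;
```

Generating the statements first makes it possible to review them, add a WHERE clause per table, and test against a copy of the data before anything is changed.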

For now, though, I wanted to let you know how much I appreciate the people who have helped me in this thread. I am HUGELY appreciative. Thank you so much.

mtf

Lowell
mrTexasFreedom, I'm glad my code helped a bit, but I think mister.magoo really identified the culprit:

some process imported (and maybe still imports) data as varchar that should have been handled as nvarchar.

You should try to track down that process and fix it at the source; otherwise this cleanup-after-the-mess work will be needed every time that process runs.

Lowell

mrTexasFreedom
Thank you for the endorsement on Magoo's convert trick. I am going to adjust it so the final character isn't repeated and if I can get that to work, I'll use it in my stored proc.

The source of those documents has been identified and we're putting processes in place to prevent more corrupt data from being imported. Our developers are also working on a front-end filter to dump anything that isn't standard UTF8. I'm the person tasked with cleaning up the mess that's already been created.

Have a great weekend!

mtf

mister.magoo
You should be able to just wrap the existing method in LEFT(....,LEN(name)) to get the correct answer, but only if the original data was no more than half as long as your column can hold; otherwise you have already lost some data.
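
One hedged way to gauge that risk is to list the values that fill the column completely, since those are the ones that may have been cut off when the doubled data was stored. This sketch assumes the column is nvarchar(128), as in the thread title, and reuses the table and column names from the earlier posts:

```sql
-- nvarchar(128) holds at most 256 bytes, so a value of exactly 256 bytes
-- is a candidate for having been truncated on the way in.
SELECT id, name, DATALENGTH(name) AS ByteCount
FROM my_user_table
WHERE DATALENGTH(name) = 256;
```

Rows returned here are only candidates; a value can also be legitimately 128 characters long.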

MM

prathibha_aviator
    mrTexasFreedom (1/17/2013)
    Lowell,

    Sadly, this isn't matching any rows:

    select id, name 
    FROM my_user_table
    WHERE CHARINDEX(name,CHAR(0),1) > 0



    No rows returned. Likewise, the replace operation doesn't touch any of the rows. Ideas?

    Thank you very much for the help you've provided thus far. I feel like I'm on the verge of clobbering this beast thanks to your assistance!

    mtf


It's not returning any rows because the arguments to CHARINDEX are in the wrong order: the expression to find comes first, followed by the expression to search.

If you use CHARINDEX(CHAR(0), name, 1), I believe you will catch the hidden bad characters. I know this thread is pretty old, but I loved how you all turned it into such a quick learning session. Thanks to Lowell for sharing this thread with me.
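
A short sketch of the corrected argument order, using a hypothetical sample value. The COLLATE clause is an added precaution and an assumption, not something confirmed in this thread: under some collations a NUL may be ignored in comparisons, so a binary collation is used to compare by code point.

```sql
-- Hypothetical sample: 'A', NUL, 'B' in an nvarchar value.
DECLARE @name nvarchar(128) = N'A' + NCHAR(0) + N'B';

-- CHARINDEX(expressionToFind, expressionToSearch, start_location)
SELECT CHARINDEX(NCHAR(0), @name COLLATE Latin1_General_BIN2, 1) AS NulPosition;

-- The same test applied to the table from the earlier posts.
SELECT id, name
FROM my_user_table
WHERE CHARINDEX(NCHAR(0), name COLLATE Latin1_General_BIN2, 1) > 0;
```

NCHAR(0) is used instead of CHAR(0) because the column is nvarchar; either should convert to the same character here.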

--Pra

mrTexasFreedom
Thanks for spotting that. I'll try that out in my test DB and see if it is a more efficient method.

Appreciatively,

mtf