﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / SQL Server 2008 / SQL Server 2008 - General  / How to match a UNICODE NULL character within an nVarchar(128) string? / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Fri, 24 May 2013 02:10:48 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>Thanks for spotting that. I'll try that out in my test DB and see if it is a more efficient method.Appreciatively,mtf</description><pubDate>Mon, 18 Feb 2013 13:54:41 GMT</pubDate><dc:creator>mrTexasFreedom</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>[quote][b]mrTexasFreedom (1/17/2013)[/b][hr]Lowell,Sadly, this isn't matching any rows:[code="sql"]select id, name FROM my_user_tableWHERE CHARINDEX(name,CHAR(0),1) &amp;gt; 0[/code]No rows returned. Likewise, the replace operation doesn't touch any of the rows. Ideas?[b]Thank you very much for the help you've provided thus far.[/b] I feel like I'm on the verge of clobbering this beast thanks to your assistance!mtf[/quote]Its not returning any rows because U cannot use a search expression first followed by a find expression in the syntax of CHARINDEXif you use CHARINDEX (CHAR(0), name, 1 ), I surely believe you will catch the bad hidden characters.. I know this thread is pretty old. But I just loved the way you all made it so quick learning session.. Thanks to Lowell for sharing this thread to me..</description><pubDate>Mon, 18 Feb 2013 13:01:03 GMT</pubDate><dc:creator>prathibha_aviator</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>You should be able to just wrap the existing method in LEFT(....,LEN(name)) to get the correct answer...but only if the original data was no more than half as long as your column can hold, otherwise you have lost some data...</description><pubDate>Fri, 18 Jan 2013 16:34:23 GMT</pubDate><dc:creator>mister.magoo</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>Thank you for the endorsement on Magoo's convert trick. I am going to adjust it so the final character isn't repeated and if I can get that to work, I'll use it in my stored proc.The source of those documents has been identified and we're putting processes in place to prevent more corrupt data from being imported. Our developers are also working on a front-end filter to dump anything that isn't standard UTF8. I'm the person tasked with cleaning up the mess that's already been created.Have a great weekend!mtf</description><pubDate>Fri, 18 Jan 2013 15:44:47 GMT</pubDate><dc:creator>mrTexasFreedom</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>mrTexasFreedom I'm glad my code helped a bit, but mister magoo really identified the culprit, i think;some process imported(and maybe [b]still [/b]imports) data that should be nvarchar instead of varchar; you should try to track down whatever that process is and fix it at the source; otherwise this cleanup-after-the-mess thing is going to be needed every time that other process runs.</description><pubDate>Fri, 18 Jan 2013 15:22:58 GMT</pubDate><dc:creator>Lowell</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>Lowell,[b]That was just the ticket![/b]The select StripNonAlphaNumeric(name) produces exactly the ASCII without the UNICODE NULLs embedded. The update also seems to work in the testing I've done so far!!! For applying it to the entire table, I wanted to limit it to just the affected rows, so I use the earlier nested selects to narrow it down:[code="sql"]update my_user_tableset name=dbo.StripNonAlphaNumeric(name)WHERE ID in		(select DISTINCT B.id FROM 			(select id, name, N as Position,  				SUBSTRING(name,N,1) As TheChar,  				ASCII(SUBSTRING(name,N,1)) TheAsciiCode				from my_user_table				CROSS APPLY (SELECT TOP 255 ROW_NUMBER() OVER (ORDER BY (SELECT NULL) -1) AS n             	FROM   sys.columns) MiniTally				WHERE  MiniTally.n BETWEEN 0 AND 255 				AND MiniTally.n &amp;lt; LEN(name)+1) B		WHERE B.TheAsciiCode=0)[/code]We have these UNICODE NULLS infecting many of the columns in many of our tables. I'm going to write a stored procedure that utilizes this function to programmatically clean every column in every table. I'll update this thread with my solution soon.For now, though, I wanted to let you know how much I appreciate the people who have helped me in this thread. I am HUGELY appreciative. Thank you so much.mtf</description><pubDate>Fri, 18 Jan 2013 15:18:53 GMT</pubDate><dc:creator>mrTexasFreedom</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>Magoo,I think you're right about how there are embedded NULLS between each character. I ran the series of convert functions you provided and it showed AHSGuest††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††.I really appreciate you providing this. I might use a modified version of this in a stored procedure as I create a programmatic method of cleaning all the columns of all our tables.I'll update this thread with that stored procedure.mtf</description><pubDate>Fri, 18 Jan 2013 15:11:08 GMT</pubDate><dc:creator>mrTexasFreedom</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>[quote][b]mrTexasFreedom (1/17/2013)[/b][hr]Magoo,For the 'AHSGUEST' value, here's what I see:[code="sql"]select id, name, DATALENGTH(name) as 'thedatalengthName'FROM my_user_tableWHERE id='123'123	A	30[/code]mtf[/quote]I suspect that somewhere along the line, a unicode string has been [b]badly[/b] converted, first to non-unicode with every unicode character being converted to two non-unicode characters, like you are seeing - for example, an A followed by an ascii 0, then converted back to unicode leaving you a unicode string that contains the null (zero) character in every other position.So, it is easy to remove every other character from the strings, but the big question will be Have you lost any information in that process? if you remove the zeros and the results are OK, then fine...otherwise you may need to find the source of the data and import it correctly.edit:Try this conversion CONVERT(NVARCHAR(64),CONVERT(VARBINARY(128),CONVERT(CHAR(128),name)))to see if you get something that looks vaguely right...</description><pubDate>Fri, 18 Jan 2013 11:06:22 GMT</pubDate><dc:creator>mister.magoo</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>or another version: a scalar function that would remove anything that was not A-Z/numeric:same basic logic, i think:[code]select   id,   name,dbo.StripNonAlphaNumeric(name) As Fixedfrom my_user_table[/code]If that's cleaning up the results correctly, a simple replace should fix it:[code]UPDATE my_user_tableSET  name = StripNonAlphaNumeric(name)[/code]and the function definition:[code]CREATE FUNCTION StripNonAlphaNumeric(@OriginalText VARCHAR(8000))RETURNS VARCHAR(8000)  BEGIN DECLARE @CleanedText VARCHAR(8000) ;WITH tally (N) as(SELECT TOP 10000 row_number() OVER (ORDER BY sc1.id) FROM Master.dbo.SysColumns sc1 CROSS JOIN Master.dbo.SysColumns sc2)SELECT @CleanedText = ISNULL(@CleanedText,'') +  CASE   --ascii numbers are 48(for '0') thru 57 (for '9')  WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 48 AND  57    THEN SUBSTRING(@OriginalText,Tally.N,1)   --ascii upper case letters A-Z is 65 thru 90  WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 65 AND  90    THEN SUBSTRING(@OriginalText,Tally.N,1)   --ascii lower case letters a-z is 97 thru 122  WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 97 AND  122    THEN SUBSTRING(@OriginalText,Tally.N,1)   ELSE '' END      FROM tally           WHERE Tally.N &amp;lt;= LEN(@OriginalText)                            RETURN @CleanedText END[/code]</description><pubDate>Fri, 18 Jan 2013 10:54:11 GMT</pubDate><dc:creator>Lowell</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>[quote][b]mrTexasFreedom (1/17/2013)[/b][hr]Lowell,Sadly, this isn't matching any rows:[code="sql"]select id, name FROM my_user_tableWHERE CHARINDEX(name,CHAR(0),1) &amp;gt; 0[/code]No rows returned. Likewise, the replace operation doesn't touch any of the rows. Ideas?[b]Thank you very much for the help you've provided thus far.[/b] I feel like I'm on the verge of clobbering this beast thanks to your assistance!mtf[/quote]What do you get if you replace the ASCII function with the UNICODE function in Lowell's original diagnostic code (the one that splits the character string into single characters and returns the character code)? If I understood correctly that your troublesome column is NVARCHAR, the UNICODE function should give you the character code to weed out, and if it's a non-printing control character, it may be different than ASCII 0.</description><pubDate>Fri, 18 Jan 2013 09:45:00 GMT</pubDate><dc:creator>wolfkillj</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>Magoo,For the 'AHSGUEST' value, here's what I see:[code="sql"]select id, name, DATALENGTH(name) as 'thedatalengthName'FROM my_user_tableWHERE id='123'123	A	30[/code]mtf</description><pubDate>Thu, 17 Jan 2013 13:12:39 GMT</pubDate><dc:creator>mrTexasFreedom</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>Lowell,Sadly, this isn't matching any rows:[code="sql"]select id, name FROM my_user_tableWHERE CHARINDEX(name,CHAR(0),1) &amp;gt; 0[/code]No rows returned. Likewise, the replace operation doesn't touch any of the rows. Ideas?[b]Thank you very much for the help you've provided thus far.[/b] I feel like I'm on the verge of clobbering this beast thanks to your assistance!mtf</description><pubDate>Thu, 17 Jan 2013 13:09:11 GMT</pubDate><dc:creator>mrTexasFreedom</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>Just wondering what is the datalength() of one of those names, such as AHSGuest  ?</description><pubDate>Thu, 17 Jan 2013 06:19:58 GMT</pubDate><dc:creator>mister.magoo</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>well, so far it looks like a single offending character, so i think you can do a simple replace();if this produces the desired name , i would update it with the sample below:[code]select   id,   name,  REPLACE(name,CHAR(0),'') As Fixedfrom my_user_table--only bad data for clarityWHERE CHARINDEX(name,CHAR(0),1) &amp;gt; 0[/code]If that's cleaning up the results correctly, a simple replace should fix it:[code]UPDATE my_user_tableSET  name = REPLACE(name,CHAR(0),'')--only bad data for to match the #rows in our original selectWHERE CHARINDEX(name,CHAR(0),1) &amp;gt; 0[/code]</description><pubDate>Thu, 17 Jan 2013 05:30:01 GMT</pubDate><dc:creator>Lowell</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>[b]Hey, that's a pretty cool piece of SQL![/b] Thanks for offering it.I think the +1 helps it traverse the string, but honestly, this is using some structures I've never even seen before (CROSS APPLY -- I need to study up on that!), so I just took a guess on the +1....[code="sql"]select   name,  N as Position,  SUBSTRING(name,N,1) As TheChar,  ASCII(SUBSTRING(name,N,1)) TheAsciiCodefrom my_user_tableCROSS APPLY (SELECT TOP 255 ROW_NUMBER() OVER (ORDER BY (SELECT NULL) -1) AS n             FROM   sys.columns) MiniTallyWHERE  MiniTally.n BETWEEN 0 AND 255 AND MiniTally.n &amp;lt; LEN(name) +1ORDER BY name,N[/code][b]The Results:[/b][code="sql"]A	1	A	65A	2		0A	3	H	72A	4		0A	5	S	83A	6		0A	7	G	71A	8		0A	9	u	117A	10		0A	11	e	101A	12		0A	13	s	115A	14		0A	15	t	116[/code]I think the ascii 0 value is where the UNICODE NULLS are. The above example should be 'AHSGuest', but in the table, it just looks like 'A' with nothing else after the letter A.Here's an ugly recursion using the SQL you provided to explicitly display the names contaminated with the UNICODE NULLS:[code="sql"]select id, name, N as Position,  SUBSTRING(name,N,1) As TheChar,  ASCII(SUBSTRING(name,N,1)) TheAsciiCodefrom my_user_tableCROSS APPLY (SELECT TOP 255 ROW_NUMBER() OVER (ORDER BY (SELECT NULL) -1) AS n             FROM   sys.columns) MiniTallyWHERE  MiniTally.n BETWEEN 0 AND 255 AND MiniTally.n &amp;lt; LEN(name)+1AND ID in		(select DISTINCT B.id FROM 			(select id, name, N as Position,  				SUBSTRING(name,N,1) As TheChar,  				ASCII(SUBSTRING(name,N,1)) TheAsciiCode				from my_user_table				CROSS APPLY (SELECT TOP 255 ROW_NUMBER() OVER (ORDER BY (SELECT NULL) -1) AS n             	FROM   sys.columns) MiniTally				WHERE  MiniTally.n BETWEEN 0 AND 255 				AND MiniTally.n &amp;lt; LEN(name)+1) B		WHERE B.TheAsciiCode=0) ORDER BY name,N[/code]Any idea on how I can get rid of these ASCII(0) characters within my strings?Thanks for your help on this!!!mtf</description><pubDate>Thu, 17 Jan 2013 02:14:19 GMT</pubDate><dc:creator>mrTexasFreedom</dc:creator></item><item><title>RE: How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>something like this can help you identify the actual char;i don't know what unicode null is (is that ASCII code(0)?but this will help figure it out; [code]select   name,  N as Position,  SUBSTRING(name,N,1) As TheChar,  ASCII(SUBSTRING(name,N,1)) TheAsciiCodefrom usersCROSS APPLY (SELECT TOP 255 ROW_NUMBER() OVER (ORDER BY (SELECT NULL) -1) AS n             FROM   sys.columns) MiniTallyWHERE  MiniTally.n BETWEEN 0 AND 255 AND MiniTally.n &amp;lt; LEN(name)ORDER BY name,N[/code]</description><pubDate>Wed, 16 Jan 2013 14:21:46 GMT</pubDate><dc:creator>Lowell</dc:creator></item><item><title>How to match a UNICODE NULL character within an nVarchar(128) string?</title><link>http://www.sqlservercentral.com/Forums/Topic1408071-391-1.aspx</link><description>Our content has been contaminated by UNICODE NULL characters in our SQL Server 2008 database that don't play nicely with our java libraries that depend on UTF8 characters.I am trying to compose SQL statements that will identify the characters within the nVarchars and then replace with an empty value ''. The main problem I'm having is creating a check to see whether a particular character is a UNICODE NULL or not.When I try to view the values in the database, they look like 'u' instead of 'username'. I suspect there is an invisible UNICODE NULL just after the first letter. My goal is to simply delete that second, invisible character (UNICODE NULL).[quote]SELECT name FROM usersWHERE name LIKE '%[^ -~]%' COLLATE Latin1_General_BIN[/quote][b]Returns the following:[/b][quote]BMMNSSSSÿþA[/quote][quote]select name, CAST(RIGHT(name,1) AS varbinary(128)) AS RIGHTER_1,from userswhere id=1[/quote][b]returns:[/b][quote]B   0x4200[/quote]Any ideas are greatly appreciated.mtf</description><pubDate>Wed, 16 Jan 2013 13:39:57 GMT</pubDate><dc:creator>mrTexasFreedom</dc:creator></item></channel></rss>