Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Identifying ASCII characters in NVARCHAR columns


Identifying ASCII characters in NVARCHAR columns

Author
Message
chrismalone1
chrismalone1
SSC Rookie
SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)

Group: General Forum Members
Points: 46 Visits: 40
Hello,

I have an issue where I am storing various international characters in nvarchar columns, but need to branch the data at one point of processing so that ASCII characters are run through an additional cleansing process and all non-ASCII characters are set aside.

Is there a way to identify which nvarchar values are within the ASCII range and can be converted to varchar without corruption? Also, the strings may contain a mix of english and international character sets, so the entire string must be checked and not just the first character.

i.e.
Pass:
'Hello', 'abc123'

Fail:
'太平市', 'abc太123'

Any help would be appreciated and let me know if examples would make this more clear.

Regards,
Chris
Steve Jones
Steve Jones
SSC Guru
SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)

Group: Administrators
Points: 51828 Visits: 19005
I know CAST will allow you to change it to a VARCHAR, and you'll end up with a variety of ? in the items for non-ASCII characters.

Perhaps you can somehow search for those placeholders and if they aren't there (CHARINDEX=0), you can do your processing?

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
chrismalone1
chrismalone1
SSC Rookie
SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)

Group: General Forum Members
Points: 46 Visits: 40
Thanks Steve, this may be an option if we cannot figure out a better way around this. I just need to confirm that we are not allowing ? characters to exist in any of the strings I am cleansing and, if so, create some type of internal process for escaping these out so they do not trigger the 'international' flag.

Appreciate the help.
Lowell
Lowell
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24771 Visits: 39713
proof of concept for Steve's idea;i thought that was kind of neat:


--results:
Status cnv id val id val
------ ------------------------------ ----------- ------- ----------- -------
Pass Hello 1 Hello 1 Hello
Pass abc123 2 abc123 2 abc123
Fail ??? 3 太平市 3 太平市
Fail abc?123 4 abc太123 4 abc太123




the test code:

SELECT 1 as id,N'Hello' As val
INTo #TMP UNION ALL
SELECT 2 as id,N'abc123' UNION ALL
SELECT 3 as id,N'太平市' UNION ALL
SELECT 4 as id,N'abc太123'

SELECT
CASE
WHEN t1.val = CONVERT(varchar,t2.val)
THEN 'Pass'
ELSE 'Fail'
END As Status,
CONVERT(varchar,t2.val) As cnv,*
FROM #TMP t1
LEFT OUTER JOIN #TMP t2
ON t1.Id = t2.ID



Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

chrismalone1
chrismalone1
SSC Rookie
SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)

Group: General Forum Members
Points: 46 Visits: 40
Thanks for the code Lowell, I have been playing around with this myself and it looks like it is going to work out well for my problem.

Thanks again both of you for the help!
debrbender
debrbender
Forum Newbie
Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)

Group: General Forum Members
Points: 9 Visits: 19
Also, once failures are found, you could distinguish actual '?' values from Unicode values and get the percentage of Unicode characters with ...

SELECT t.val
, UnicodeStringLen = len(t.val)
, UnicodeNonQmarkChars = len(replace(t.val,'?',''))
, NonQmarkChars = len(replace(convert(varchar, t.val),'?',''))
, UnicodeChars = len(replace(t.val,'?',''))
- len(replace(convert(varchar, t.val),'?',''))
, PercentUnicodeChars =
(LEN(replace(t.val,'?','')) - len(replace(convert(varchar, t.val),'?','')))
* 100.0 / len(t.val)
FROM #TMP t
WHERE t.val <> CONVERT(varchar,t.val) --FAILURES ONLY
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search