Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Finding special characters within character strings


Finding special characters within character strings

Author
Message
michael.f.morin
michael.f.morin
SSC Rookie
SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)

Group: General Forum Members
Points: 46 Visits: 128
Comments posted to this topic are about the item Finding special characters within character strings
wes_springob
wes_springob
Forum Newbie
Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)

Group: General Forum Members
Points: 1 Visits: 12
You say you run this every morning? Why not just set up a constraint on the column to prevent the problem characters from getting in, in the first place?
halifaxdal
halifaxdal
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1292 Visits: 1720
interesting summary article, worth to save for later use
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (55K reputation)SSC Guru (55K reputation)SSC Guru (55K reputation)SSC Guru (55K reputation)SSC Guru (55K reputation)SSC Guru (55K reputation)SSC Guru (55K reputation)SSC Guru (55K reputation)

Group: General Forum Members
Points: 55638 Visits: 40407
wes_springob (8/19/2014)
You say you run this every morning? Why not just set up a constraint on the column to prevent the problem characters from getting in, in the first place?


Heh... because if someone fails to mail me a check because of a typo in a bulk load, there will be hell to pay! :-D For example, if I live at 5678 Main Street and someone accidently had the SHIFT key depressed when they typed the "5" so that the address came out as %678 Main Street (just as an example), I'd want someone working on the correction rather than just rejecting the row of bulk data. Why not do those types of checks in the front end before it makes it to a bulk file? Heh... yeah... people always do those kinds of checks in the front end, don't they? :-P

It just doesn't hurt to do these types of checks in the database, especially where any form of keyboard input was ever involved. It would be nice if they did it in the front end, but there's no guarantees there. These types of checks might also be good for capturing examples of what people have been typing to design new "validation" code for the front end or whatever source the data came from.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Chris Hurlbut
Chris Hurlbut
SSCommitted
SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)

Group: General Forum Members
Points: 1792 Visits: 540
What gets me most of the time are Carriage Returns, Line Feeds and tabs.
We receive many different file formats from different clients, while we try to standardize feeds as much as possible, these sometimes end up in the data I have to process.
Usually Excel is involved in some way...Analysts love excel but for ETL it makes me cringe.
alexander.oss
alexander.oss
SSC-Enthusiastic
SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)

Group: General Forum Members
Points: 129 Visits: 127
Beware the Unicode character FULLWIDTH APOSTROPHE ' (U+FF07), because we had a problem with it once since it looks like SQL Server will treat it like any other apostrophe, i.e. as a string constant delimiter or apostrophe escape. So if you're building dynamic SQL (which of course you aren't, and of course neither were we) and you're manually looking for characters to escape or otherwise handle before executing your SQL string, you may need to handle it too.
michael.f.morin
michael.f.morin
SSC Rookie
SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)

Group: General Forum Members
Points: 46 Visits: 128
If you own, or are a member of, the project that developed the system that would be great. Sadly, many times you are administering a system that you cannot change, at least in the short term, or have very little control over at that level.
NBSteve
NBSteve
SSCrazy
SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)

Group: General Forum Members
Points: 2149 Visits: 904
Thanks for the article! One clarification:

Also notice that “Ça va” is missing. If the column is case sensitive, should that not have been captured? Again this is not a course on regular expressions but note: the regular expression does not see the accented characters as special. Meaning (é and e) or (Ä and A) are evaluated as the same, as far as the expression is concerned.


This is incorrect, they are not evaluated the same. When checking ranges in regular expressions, the collation setting is used. When you specify [A-Z], this includes every character alphabetically sorted between 'A' and 'Z' according to the collation. 'Ç' is sorted between 'C' and 'D' alphabetically in this collation, and so is included in [A-Z].


SELECT 'IT WORKS'
WHERE 'C' COLLATE SQL_Latin1_General_CP1_CI_AS LIKE ('%[C-C]%');

SELECT 'IT WORKS'
WHERE 'C' COLLATE SQL_Latin1_General_CP1_CI_AS LIKE ('%[B-C]%');

SELECT 'IT WORKS'
WHERE 'C' COLLATE SQL_Latin1_General_CP1_CI_AS LIKE ('%[C-D]%');

SELECT 'IT WORKS'
WHERE 'Ç' COLLATE SQL_Latin1_General_CP1_CI_AS LIKE ('%[C-C]%');

SELECT 'IT WORKS'
WHERE 'Ç' COLLATE SQL_Latin1_General_CP1_CI_AS LIKE ('%[B-C]%');

SELECT 'IT WORKS'
WHERE 'Ç' COLLATE SQL_Latin1_General_CP1_CI_AS LIKE ('%[C-D]%');


In 1-3, you can see that a plain 'C' matches all 3 ranges. However, in 4-6, the accented 'Ç' character only matches the range [C-D].
tom.w.brannon
tom.w.brannon
Ten Centuries
Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)

Group: General Forum Members
Points: 1371 Visits: 1304
Thanks for the post! I set up a search on all the data in my modest sized database and found that LEN also ignores CRLF characters. I found I had a field that had been loaded with trailing blanks up to the last two characters 0D 0A. The value from LEN varied but DATALEN was always 65. Fixed with LEFT on the LEN.
Robert.Sterbal
Robert.Sterbal
SSC Veteran
SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)

Group: General Forum Members
Points: 243 Visits: 2000
I'm not sure why we don't check our data as many ways as possible. I have an environment that assumes everything is installed as expected, and a help desk busy with inquiries for when it isn't.

The key is to develop testing that is efficient to run.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search