|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Monday, May 04, 2009 8:22 AM
Points: 4,
Visits: 12
|
|
I have to remove leading and trailing spaces from a nvarchar field (used as a UNIQUE INDEX) in a database.
Having strange results with trailing spaces, I performed theses tests :
SELECT (CASE 'Chaine1' WHEN 'Chaine1' THEN 1 ELSE 0 END) as CHAINE; SELECT (CASE 'Chaine1' WHEN ' Chaine1' THEN 1 ELSE 0 END) as CHAINE; SELECT (CASE 'Chaine1' WHEN 'Chaine1 ' THEN 1 ELSE 0 END) as CHAINE;
The results are : Query 1 returns 1 (sounds good) Query 2 returns 0 (still good) Query 3 returns 1 
Then i tried a more simple test
SELECT (CASE '' WHEN '' THEN 1 ELSE 0 END) as RESULT; SELECT (CASE '' WHEN ' ' THEN 1 ELSE 0 END) as RESULT; SELECT (CASE ' ' WHEN '' THEN 1 ELSE 0 END) as RESULT;
Am I missing something ?
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Monday, May 04, 2009 8:22 AM
Points: 4,
Visits: 12
|
|
I found the reason why 'a' is equal to 'a ' or 'a '. ANSI Standard : Have a look to :
http://msdn.microsoft.com/en-us/library/ms187403.aspx
|
|
|
|
|
SSCoach
         
Group: General Forum Members
Last Login: Tuesday, May 21, 2013 1:55 PM
Points: 15,442,
Visits: 9,571
|
|
Have you thought about using LTrim and RTrim for this? Might be easier.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Monday, May 04, 2009 8:22 AM
Points: 4,
Visits: 12
|
|
Yes, but this is not the problem.
The exact problem is that for SQL norm if you compare two string with different lengths the first thing done by SQL is to make them to the same length by adding trailing spaces.
So, if your query compares string1 'a' and string2 'a ', string1 is first converted to 'a ' then compared to string2, and now the two string are the same.
Finally and fortunately, if the field is a UNIQUE INDEX it is not possible to have 'a' and 'a ' in two different rows. If it is not a UNIQUE INDEX field, then you will have to use RTRIM, LTRIM and LEN function with an extra character like LEN('a'+'#')=2 and LEN('a '+'#')=3.
Len('a') and len('a ') give ...1
What you must keep in mind is : Remove trailing spaces before insertion !
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Friday, May 17, 2013 12:22 PM
Points: 10,571,
Visits: 11,871
|
|
serge.laot (5/13/2008) Yes, but this is not the problem.
The exact problem is that for SQL norm if you compare two string with different lengths the first thing done by SQL is to make them to the same length by adding trailing spaces.
So, if your query compares string1 'a' and string2 'a ', string1 is first converted to 'a ' then compared to string2, and now the two string are the same.
Finally and fortunately, if the field is a UNIQUE INDEX it is not possible to have 'a' and 'a ' in two different rows. If it is not a UNIQUE INDEX field, then you will have to use RTRIM, LTRIM and LEN function with an extra character like LEN('a'+'#')=2 and LEN('a '+'#')=3.
Len('a') and len('a ') give ...1
What you must keep in mind is : Remove trailing spaces before insertion !
Actually when inserting into variable length character columns SQL Server automatically trims trailing spaces which is why 'a' and 'a ' would create an issue with a unique index as both would be 'a'. When using fixed length character columns SQL Server right pads the string on insertion so in a char(10) column 'a' and 'a ' would both be 'a ', a followed by 9 spaces.
Jack Corbett
Applications Developer Don't let the good be the enemy of the best. -- Paul Fleming
Check out these links on how to get faster and more accurate answers: Forum Etiquette: How to post data/code on a forum to get the best help Need an Answer? Actually, No ... You Need a Question How to Post Performance Problems Crosstabs and Pivots or How to turn rows into columns Part 1 Crosstabs and Pivots or How to turn rows into columns Part 2
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Monday, May 04, 2009 8:22 AM
Points: 4,
Visits: 12
|
|
I agree with a column type of char(10) as you say. When it is nvarchar(250) on SQL Serveur 2005 if you run the following query :
SELECT LEN(fieldname) FROM table; you will not get 250 for each row but the actual length of the string minus the trailing spaces. So up to this point I agree with you. Inserting 'a' and then 'a ' will create an issue if the column is set as unique index whenever it is a char or nvarchar column, absolutly yes.
But let's take an example. First create a simple table et queries
CREATE TABLE TEST ( [id] [int] NOT NULL , [text] [nvarchar] (50) NOT NULL ) ON [PRIMARY] GO
ALTER TABLE [TEST] WITH NOCHECK ADD CONSTRAINT [PK_TEST] PRIMARY KEY CLUSTERED ( [id] ) ON [PRIMARY] GO
CREATE UNIQUE INDEX [IX_TEST] ON [TEST]([text]) ON [PRIMARY] GO
/* Insert with a trailing space *: INSERT INTO TEST (id, text) VALUES (1, 'a ') GO INSERT INTO TEST (id, text) VALUES (1, 'b') GO /* Then SELECT with like */ SELECT * FROM TEST WHERE text LIKE 'a' /* Returns nothing */ SELECT * FROM TEST WHERE text LIKE 'a ' /* Returns 1, 'a ' */ SELECT * FROM TEST WHERE text LIKE '_ ' /* Returns 1, 'a ' */ SELECT * FROM TEST WHERE text LIKE '% ' /* returns 1,'a ' */ GO
So here, the trailing space remaining during insertion can have some effect according to the way you write your LIKE statement...
Could be confusing sometimes.
|
|
|
|
|
SSCoach
         
Group: General Forum Members
Last Login: Tuesday, May 21, 2013 1:55 PM
Points: 15,442,
Visits: 9,571
|
|
serge.laot (5/13/2008) Yes, but this is not the problem.
Per your original post:
I have to remove leading and trailing spaces from a nvarchar field (used as a UNIQUE INDEX) in a database.
I suggested using LTrim and RTrim, which would do exactly what you asked for. Then you say that's not the problem. I'm not following your train of thought here. Please clarify what you actually want to do, since it's apparently not what you originally asked, or I misunderstood what you are asking for.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
|
|
|
|
|
SSCoach
         
Group: General Forum Members
Last Login: Tuesday, May 21, 2013 1:55 PM
Points: 15,442,
Visits: 9,571
|
|
(Your test data doesn't work as written, since you tried to insert the value 1 into your ID field twice, after defining it as the primary key.)
Yes, your tests are correct, and are the expected behavior. Why would they not be?
If you set ANSI_PADDING "off", it will do it the way you seem to expect. ANSI_PADDING "on", does it the way your tests show. "On" is the recommended setting. (See "SET ANSI_PADDING" in Books Online for details.)
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 10:31 AM
Points: 6,998,
Visits: 13,950
|
|
serge.laot (5/13/2008)
I agree with a column type of char(10) as you say. When it is nvarchar(250) on SQL Serveur 2005 if you run the following query : SELECT LEN(fieldname) FROM table; you will not get 250 for each row but the actual length of the string minus the trailing spaces. So up to this point I agree with you. Inserting 'a' and then 'a ' will create an issue if the column is set as unique index whenever it is a char or nvarchar column, absolutly yes. But let's take an example. First create a simple table et queries CREATE TABLE TEST ( [id] [int] NOT NULL , [text] [nvarchar] (50) NOT NULL ) ON [PRIMARY] GO
ALTER TABLE [TEST] WITH NOCHECK ADD CONSTRAINT [PK_TEST] PRIMARY KEY CLUSTERED ( [id] ) ON [PRIMARY] GO
CREATE UNIQUE INDEX [IX_TEST] ON [TEST]([text]) ON [PRIMARY] GO
/* Insert with a trailing space *: INSERT INTO TEST (id, text) VALUES (1, 'a ') GO INSERT INTO TEST (id, text) VALUES (1, 'b') GO /* Then SELECT with like */ SELECT * FROM TEST WHERE text LIKE 'a' /* Returns nothing */ SELECT * FROM TEST WHERE text LIKE 'a ' /* Returns 1, 'a ' */ SELECT * FROM TEST WHERE text LIKE '_ ' /* Returns 1, 'a ' */ SELECT * FROM TEST WHERE text LIKE '% ' /* returns 1,'a ' */ GO
So here, the trailing space remaining during insertion can have some effect according to the way you write your LIKE statement... Could be confusing sometimes.
Not quite. Your use of LEN is throwing you off. Per BOL, the definition of LEN is :
Returns the number of characters of the specified string expression, excluding trailing blanks.
So -
len('a') len('a ') len('a ') are all 1, but
len(' a') is 2. Jack's actually right. If you use varchar or nvarchar, then insert values into those columns will TRIM extra spaces (from the right only) out of the value, making 'a' and 'a ' equal. On the other hand, using CHAR or NCHAR would PAD the field up to the requisite length, in essence ALSO making them equal.
---------------------------------------------------------------------------------- Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Friday, May 17, 2013 12:22 PM
Points: 10,571,
Visits: 11,871
|
|
GSquared (5/13/2008) (Your test data doesn't work as written, since you tried to insert the value 1 into your ID field twice, after defining it as the primary key.)
Yes, your tests are correct, and are the expected behavior. Why would they not be?
If you set ANSI_PADDING "off", it will do it the way you seem to expect. ANSI_PADDING "on", does it the way your tests show. "On" is the recommended setting. (See "SET ANSI_PADDING" in Books Online for details.)
In addition to the error G, notes, your test is flawed as why would you use like without a wildcard? If that is what I want I will use "=" and that works fine on a variable length character column.
I have to admit I originally thought trailing spaces were trimmed from variable length columns and they are not which does appear to affect Like, but I rarely, if ever, use Like with a leading wildcard and trailing spaces as the leading wildcard in itself will most likely cause the query to not use an any indexes. It is disconcerting that "=" ignores the trailing spaces and "Like" does not, but I have not encountered this because when I use wildcards it is at the end of the statement not the beginning.
Jack Corbett
Applications Developer Don't let the good be the enemy of the best. -- Paul Fleming
Check out these links on how to get faster and more accurate answers: Forum Etiquette: How to post data/code on a forum to get the best help Need an Answer? Actually, No ... You Need a Question How to Post Performance Problems Crosstabs and Pivots or How to turn rows into columns Part 1 Crosstabs and Pivots or How to turn rows into columns Part 2
|
|
|
|