Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««1234»»»

String comparison with trailing spaces Expand / Collapse
Author
Message
Posted Tuesday, May 13, 2008 1:12 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 11:52 PM
Points: 7,179, Visits: 15,769
Jack Corbett (5/13/2008)
GSquared (5/13/2008)
(Your test data doesn't work as written, since you tried to insert the value 1 into your ID field twice, after defining it as the primary key.)

Yes, your tests are correct, and are the expected behavior. Why would they not be?

If you set ANSI_PADDING "off", it will do it the way you seem to expect. ANSI_PADDING "on", does it the way your tests show. "On" is the recommended setting. (See "SET ANSI_PADDING" in Books Online for details.)


In addition to the error G, notes, your test is flawed as why would you use like without a wildcard? If that is what I want I will use "=" and that works fine on a variable length character column.

I have to admit I originally thought trailing spaces were trimmed from variable length columns and they are not which does appear to affect Like, but I rarely, if ever, use Like with a leading wildcard and trailing spaces as the leading wildcard in itself will most likely cause the query to not use an any indexes. It is disconcerting that "=" ignores the trailing spaces and "Like" does not, but I have not encountered this because when I use wildcards it is at the end of the statement not the beginning.


That's an interesting read. I'm curious - I have NOT messed with the ANSI_PADDING setting (as in - ever; I steer clear of screwing with the connection and/or db options), and in every database I check - the setting is OFF. And yet - the new default is ON? hmm....


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #499924
Posted Tuesday, May 13, 2008 1:28 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 8:26 AM
Points: 10,381, Visits: 13,436
I just checked all my servers (2005 and 2000) and in the model database ANSI PADDINGS is false meaning OFF and I have never messed with the default on a SQL Server installation, but in SSMS in the Query options ANSI PADDINGS is set to ON. I have tried setting it to off and the behavior remains the same. When you insert 'a ' into a variable length column it is inserting the trailing space. The DataLength function will tell you is spaces are stored as it gives the actual storage used. So inserting 'a' and 'a ' would return 1, 2 in varchar and 2, 4 in nvarchar.

This is something I'd like to know more about as it goes against everything I thought I knew about how SQL Server handled trailing spaces, especially since it seems that the ANSI_PADDING option is not being reported correctly as it is OFF in all my db's and the tests I have done today don't jive with that setting.




Jack Corbett

Applications Developer

Don't let the good be the enemy of the best. -- Paul Fleming

Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
How to Post Performance Problems
Crosstabs and Pivots or How to turn rows into columns Part 1
Crosstabs and Pivots or How to turn rows into columns Part 2
Post #499935
Posted Tuesday, May 13, 2008 1:37 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 11:52 PM
Points: 7,179, Visits: 15,769
Jack Corbett (5/13/2008)
I just checked all my servers (2005 and 2000) and in the model database ANSI PADDINGS is false meaning OFF and I have never messed with the default on a SQL Server installation, but in SSMS in the Query options ANSI PADDINGS is set to ON. I have tried setting it to off and the behavior remains the same. When you insert 'a ' into a variable length column it is inserting the trailing space. The DataLength function will tell you is spaces are stored as it gives the actual storage used. So inserting 'a' and 'a ' would return 1, 2 in varchar and 2, 4 in nvarchar.

This is something I'd like to know more about as it goes against everything I thought I knew about how SQL Server handled trailing spaces, especially since it seems that the ANSI_PADDING option is not being reported correctly as it is OFF in all my db's and the tests I have done today don't jive with that setting.


I am not sure how changing the setting AFTER something has been inserted affects things that were inserted prior to the setting (although from the way it's described - I don't think it affect those values)

Hmm. Just curious - what do you get when you run this?

create table #T(k varchar(10) primary key clustered)

insert #T
select 'a' union all
select 'a ' union all
select 'a ' union all
select 'a ' union all
select 'a '

This tells me that ANSI_PADDING really is off in my case (it errors out).


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #499940
Posted Tuesday, May 13, 2008 1:41 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 9:58 AM
Points: 13,872, Visits: 9,600
Per BOL, the column behavior is set when the column is created. The setting in place when you insert/select/update doesn't affect it. That might be affecting your tests, Jack.

On the subject of Like and wildcards, it isn't whether the wildcard precedes the string or not, it's whether you include spaces in the literal string or not.

Example:

'a ' -- 3 spaces
is like 'a%', and like '%a%', and like '%a'

It's because of how SQL Server compares strings. They are equal if they would be equal without trailing spaces.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #499942
Posted Tuesday, May 13, 2008 1:45 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 9:58 AM
Points: 13,872, Visits: 9,600
Matt:

Your test doesn't quite do it:

set ansi_padding on
go
create table #T(k varchar(10) primary key clustered)

insert #T
select 'a' union all
select 'a ' union all
select 'a ' union all
select 'a ' union all
select 'a '

(ANSI_PADDING definitely on.)

Msg 2627, Level 14, State 1, Line 3
Violation of PRIMARY KEY constraint 'PK__#T________________23AA061E'. Cannot insert duplicate key in object 'dbo.#T'.
The statement has been terminated.


Is the result.

For varchar fields, SQL still considers them equal if they would be equal without trailing spaces, regardless of this setting.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #499945
Posted Tuesday, May 13, 2008 1:51 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 9:58 AM
Points: 13,872, Visits: 9,600
I just checked my databases and servers, and ANSI padding is False on all of them, but per BOL:

Important:
In a future version of SQL Server ANSI_PADDING will always be ON and any applications that explicitly set the option to OFF will produce an error. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.


I've never messed with the setting, I assumed it was True because of what it says in BOL, but it looks like BOL might be wrong about the default.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #499949
Posted Tuesday, May 13, 2008 2:01 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 8:26 AM
Points: 10,381, Visits: 13,436
I error out on this as well, but even if I run SET ANSI_PADDING ON this code errors out as well.

But try this:

SET Ansi_Padding ON

CREATE TABLE #T(k VARCHAR(10))

INSERT #T
SELECT 'a' UNION ALL
SELECT 'a ' UNION ALL
SELECT 'a ' UNION ALL
SELECT 'a ' UNION ALL
SELECT 'a '

SELECT CASE WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting, k, DATALENGTH(k) AS datalength_k, LEN(k) AS len_k FROM #t WHERE k LIKE 'a_'
DROP TABLE #t

GO
SET ansi_padding OFF

CREATE TABLE #T(k VARCHAR(10))

INSERT #T
SELECT 'a' UNION ALL
SELECT 'a ' UNION ALL
SELECT 'a ' UNION ALL
SELECT 'a ' UNION ALL
SELECT 'a '

SELECT CASE WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting, k, DATALENGTH(k) AS datalength_k, LEN(k) AS len_k FROM #t WHERE k LIKE 'a_'
DROP TABLE #t

With ANSI_PADDING ON this eliminates the first row, but with ANSI_PADDING OFF it eliminates all the rows. Also you can see that the DataLength function returns 1-5 with ANSI_PADDING ON and 1 with ANSI_PADDING OFF. Using "=" does not seem to be affected by the setting, but "LIKE" is affected.


GSquared 5/13/08


Per BOL, the column behavior is set when the column is created. The setting in place when you insert/select/update doesn't affect it. That might be affecting your tests, Jack.


Notice that I am creating the temporary table AFTER the SET.

The question is when creating a table with SSMS which defaults to ANSI_PADDING ON is that how the table is created or is it created with the default on the database? I will be testing this in one of my already created DB's now.




Jack Corbett

Applications Developer

Don't let the good be the enemy of the best. -- Paul Fleming

Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
How to Post Performance Problems
Crosstabs and Pivots or How to turn rows into columns Part 1
Crosstabs and Pivots or How to turn rows into columns Part 2
Post #499955
Posted Tuesday, May 13, 2008 2:12 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 11:52 PM
Points: 7,179, Visits: 15,769
GSquared (5/13/2008)
I just checked my databases and servers, and ANSI padding is False on all of them, but per BOL:

Important:
In a future version of SQL Server ANSI_PADDING will always be ON and any applications that explicitly set the option to OFF will produce an error. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.


I've never messed with the setting, I assumed it was True because of what it says in BOL, but it looks like BOL might be wrong about the default.


That ON sets up some REALLY interesting things. Things that go to challenging my recollection of various fundamental math principles. For example - (taking the wayback machine all of the way back to 6th grade as I recall):

If it is true that x=y and a=b, then so is x+a=y+b

Therefore we should be to say...

'a '='a'
and
'b '='b'
then
'a '+'b '='a'+'b'

...

I feel a tear in the space/time continuum....:P


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #499961
Posted Tuesday, May 13, 2008 2:17 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 9:58 AM
Points: 13,872, Visits: 9,600
Matt Miller (5/13/2008)
GSquared (5/13/2008)
I just checked my databases and servers, and ANSI padding is False on all of them, but per BOL:

Important:
In a future version of SQL Server ANSI_PADDING will always be ON and any applications that explicitly set the option to OFF will produce an error. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.


I've never messed with the setting, I assumed it was True because of what it says in BOL, but it looks like BOL might be wrong about the default.


That ON sets up some REALLY interesting things. Things that go to challenging my recollection of various fundamental math principles. For example - (taking the wayback machine all of the way back to 6th grade as I recall):

If it is true that x=y and a=b, then so is x+a=y+b

Therefore we should be to say...

'a '='a'
and
'b '='b'
then
'a '+'b '='a'+'b'

...

I feel a tear in the space/time continuum....:P


Too bad concatenation isn't math. :)

2+6 = 8
6+2 = 8
so
6+2 = 2+6

but
'2'+'6' = '26'
and
'6'+'2' = '62'
so
'2'+'6' != '6'+'2'

But yeah, it does make for some weird things we have to keep track of in our code that don't necessarily make intuitive sense to everyone. (But then, intuition is subjective, just like common sense.)


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #499964
Posted Tuesday, May 13, 2008 2:23 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 8:26 AM
Points: 10,381, Visits: 13,436
It looks as though the ANSI_PADDING setting is taken from the connection NOT the database default. I ran the following in 2 databases (AdventureWorks which had ANSI_PADDING ON and Northwind, ANSI_PADDING OFF):

SET Ansi_Padding OFF

CREATE TABLE
test.categories
  
(
  
CategoryID INT PRIMARY KEY CLUSTERED IDENTITY(1,1),
  
CategoryName NVARCHAR(15)
   )

INSERT INTO test.categories
  
(
  
CategoryName
  
)
  
SELECT 'a' UNION ALL
  
SELECT 'a ' UNION ALL
  
SELECT 'a  ' UNION ALL
  
SELECT 'a   ' UNION ALL
  
SELECT 'a    '
  
SELECT CategoryID, CategoryName, LEN(CategoryName) AS len_name, DATALENGTH(CategoryName) AS Datalength_Name FROM test.categories

GO
SET Ansi_Padding ON

CREATE TABLE
test.categories2
  
(
  
CategoryID INT PRIMARY KEY CLUSTERED IDENTITY(1,1),
  
CategoryName NVARCHAR(15)
   )

INSERT INTO test.categories2
  
(
  
CategoryName
  
)
  
SELECT 'a' UNION ALL
  
SELECT 'a ' UNION ALL
  
SELECT 'a  ' UNION ALL
  
SELECT 'a   ' UNION ALL
  
SELECT 'a    'UNION ALL
  
SELECT 'a     '
  
SELECT CategoryID, CategoryName, LEN(CategoryName) AS len_name, DATALENGTH(CategoryName) AS Datalength_Name FROM test.categories2

DROP TABLE test.categories2
DROP TABLE test.categories


And here were the results in both cases:

ANSI PADDING OFF
CategoryID CategoryName len_name Datalength_Name
----------- --------------- ----------- ---------------
1 a 1 2
2 a 1 2
3 a 1 2
4 a 1 2
5 a 1 10

ANSI PADDING ON
CategoryID CategoryName len_name Datalength_Name
----------- --------------- ----------- ---------------
1 a 1 2
2 a 1 4
3 a 1 6
4 a 1 8
5 a 1 10
6 a 1 12

Also not how once you get to 5 characters in both cases the DataLength returns 10 which I did not expect with ANSI PADDING OFF. Further testing reveals that with ANSI PADDING OFF DataLength appears to return the full length of the character string on the final record returned.




Jack Corbett

Applications Developer

Don't let the good be the enemy of the best. -- Paul Fleming

Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
How to Post Performance Problems
Crosstabs and Pivots or How to turn rows into columns Part 1
Crosstabs and Pivots or How to turn rows into columns Part 2
Post #499969
« Prev Topic | Next Topic »

Add to briefcase ««1234»»»

Permissions Expand / Collapse