Replace doing funny things

Question

Replace doing funny things

Simon Sabin

SSCrazy Eights

Points: 8142
More actions
November 6, 2003 at 7:06 am

#84728

Can anyone explain the different behaviour of replace,
create database a collate SQL_Latin1_General_CP1_CI_AS
go
use a
go
declare @c binary(100)
set @c =cast( 'simon' + replicate (95, 0x00) as varbinary(100))
select @c
select replace(@c ,0x00,0x20)
select cast(@c as varchar(100))
go
use master
go
drop database a
go
create database a collate Latin1_General_CI_AS
go
use a
go
declare @c binary(100)
set @c =cast( 'simon' + replicate (95, 0x00) as varbinary(100))
select @c
select replace(@c ,0x00,0x20)
select cast(@c as varchar(100))
go
use master
go
drop database a

Simon Sabin
SQL Server MVP

http://sqlblogcasts.com/blogs/simons

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

fhanlon SSChampion Points: 13131 More actions · Answer 1

The different behavior has to do with the different collation sequences. To see what the collations you defined are run:

SELECT * FROM ::fn_helpcollations ()

WHERE name = 'Latin1_General_CI_AS' OR name = 'SQL_Latin1_General_CP1_CI_AS'

According to the SQL Server Books Online, "All Unicode data uses the same Unicode code page. Collations do not control the code page used for Unicode columns, only attributes such as comparison rules and case sensitivity. " The comparison rules are different so REPLACE works differently, even if the collations you chose are similar like 'Latin1_General_CI_AS' and 'SQL_Latin1_General_CP1_CI_AS'

Francis

Simon Sabin SSCrazy Eights Points: 8142 More actions · Answer 2

That doesn't explain why it has replaced everything with nothing for the Latin1 collation. It must be a bug.

Simon Sabin

Co-author of SQL Server 2000 XML Distilled

http://www.amazon.co.uk/exec/obidos/ASIN/1904347088

Simon Sabin
SQL Server MVP

http://sqlblogcasts.com/blogs/simons

NPeeters SSChampion Points: 12652 More actions · Answer 3

The difference in Collation might explain the difference.

The SQL_Latin1_... collation is a UNICODE collation. The Latin1_General_... is not.

Under the hood, SQL Server is converting everything internally to a double byte string representation, which means the second byte is always 0x00. Replacing that, yields a very strange unicode representation of a string. That probably breaks some stuff...

(Still thinking about this, like you say, it is not completely logical what is happening...)