Replace doing funny things

  • Can anyone explain the different behaviour of replace,

    create database a collate SQL_Latin1_General_CP1_CI_AS

    go

    use a

    go

    declare @C binary(100)

    set @C =cast( 'simon' + replicate (95, 0x00) as varbinary(100))

    select @C

    select replace(@c ,0x00,0x20)

    select cast(@c as varchar(100))

    go

    use master

    go

    drop database a

    go

    create database a collate Latin1_General_CI_AS

    go

    use a

    go

    declare @C binary(100)

    set @C =cast( 'simon' + replicate (95, 0x00) as varbinary(100))

    select @C

    select replace(@c ,0x00,0x20)

    select cast(@c as varchar(100))

    go

    use master

    go

    drop database a


    Simon Sabin
    SQL Server MVP

    http://sqlblogcasts.com/blogs/simons

  • The different behavior has to do with the different collation sequences. To see what the collations you defined are run:

    SELECT * FROM ::fn_helpcollations ()

    WHERE name = 'Latin1_General_CI_AS' OR name = 'SQL_Latin1_General_CP1_CI_AS'

    According to the SQL Server Books Online, "All Unicode data uses the same Unicode code page. Collations do not control the code page used for Unicode columns, only attributes such as comparison rules and case sensitivity. " The comparison rules are different so REPLACE works differently, even if the collations you chose are similar like 'Latin1_General_CI_AS' and 'SQL_Latin1_General_CP1_CI_AS'

    Francis

  • That doesn't explain why it has replaced everything with nothing for the Latin1 collation. It must be a bug.

    Simon Sabin

    Co-author of SQL Server 2000 XML Distilled

    http://www.amazon.co.uk/exec/obidos/ASIN/1904347088


    Simon Sabin
    SQL Server MVP

    http://sqlblogcasts.com/blogs/simons

  • The difference in Collation might explain the difference.

    The SQL_Latin1_... collation is a UNICODE collation. The Latin1_General_... is not.

    Under the hood, SQL Server is converting everything internally to a double byte string representation, which means the second byte is always 0x00. Replacing that, yields a very strange unicode representation of a string. That probably breaks some stuff...

    (Still thinking about this, like you say, it is not completely logical what is happening...)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply