Different Behaviour of VARCHAR and NVARCHAR

  • Hi,

    I need a small clarification. Please see the below code.

    DECLARE @var1 NVARCHAR(100)

    DECLARE @var2 VARCHAR(100)

    SELECT @var1 ='th'

    SELECT @var2='th'

    print 'nv'+@var1 ----- Outputs: nvth

    print 'nv'+replace(@var1,char(222),'a') ----- Outputs: nva

    print 'v'+@var2----- Outputs: vth

    print 'v'+replace(@var2,char(222),'b')----- Outputs: vth

    Here the variables @var1 is declared as NVARCHAR and @var2 is declared as VARCHAR.

    The second print statement is printing nva replacing the th with a.

    But the fourth print statement is printing vth.

    It is not replacing the th with b.

    Please can anyone explain why this behaviour is different.

  • Try this:

    print 'nv'+replace(@var1,nchar(222),'a') ----- Outputs: nva

    print 'v'+replace(@var2,nchar(222),'b')----- Outputs: vb

    You are attempting a double-byte character replacement, so it only makes sense with nchar arguments. If either of the arguments is nchar, the other will be converted and the replacement will work. If both arguments are char, it won't work.

  • nchar(222) or char(222) is Þ.

    How this is replacing th. Moreover th is two characters and above one is just only one character.

  • I'm actually not seeing that behavior. In both cases the 'th' is being replaced. I'm not sure why it is working at all. According to the ASCII table I am looking at 222 is a Right Half Block character.

  • CHAR(222) will be interpreted differently depending on the code page. If you are using code page 1252 (US English), it is Þ. It would only be an ASCII right half block in an old MSDOS code page (839 I think).

    NCHAR(222), whatever it is, is Unicode and should be the same character everywhere. That's the whole point of Unicode. I don't have much experience with Unicode, I haven't seen anything like the replacement originally posted before, and I have no idea where double-byte character equivalents are documented. After trying the sample code that was posted however, and seeing that a Unicode "th" was indeed recognized as CHAR(222), it didn't take much imagination to conclude that this only works when one or both of the arguments are Unicode.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply