T-SQL encode issue

  • i use the LEN() function to calculate a string length in sql server 2008 management studio.

    select LEN('ESPRINET IBÉRICA S.L.U.')

    the result : 23

    But actually,the length of the string is 24 characters, because 'É' is a European font and will occupy two place.

    In C#, it will get the right length converting the utf8 code.

    How can i get the right length for this string using T-SQL? Many thanks!:-)

  • changbluesky (8/3/2010)


    i use the LEN() function to calculate a string length in sql server 2008 management studio.

    select LEN('ESPRINET IBÉRICA S.L.U.')

    the result : 23

    But actually,the length of the string is 24 characters, because 'É' is a European font and will occupy two place.

    In C#, it will get the right length converting the utf8 code.

    How can i get the right length for this string using T-SQL? Many thanks!:-)

    I don't know about C# and exactly what you're doing there, but technically, 'É' is a single unicode character (it's the 202nd character in the unicode set.) And frankly, I'm not sure of any basis but one that it could be considered anything other than a single character. That one basis would be the number of octets needed for the storage of the character, which would be twice as much as 'E' or other standard ASCII characters.

    This would be one way to get that information:

    declare @table table

    (column0 int identity

    ,column1 nvarchar(50))

    insert into @table select N'ESPRINET IBÉRICA S.L.U.'

    ;WITH -- inline Tally table

    E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), --10

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000

    cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N) FROM E4)

    select tab.column1,

    sq.octets

    from

    (select t.column0,

    octets = SUM(CASE

    WHEN UNICODE(SUBSTRING(t.column1, tally.N, 1)) <= 127 THEN 1

    ELSE 2 END)

    from cteTally tally

    join @table t

    on tally.N <= LEN(t.column1)

    group by t.column0) sq

    join @table tab

    on tab.column0 = sq.column0

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • SQL Server has the DataLength function which returns the bytes needed to store a value. As far as I know SQL Server doesn't need 2 bytes to store the "É" as it will store it in a VARCHAR column and return it just fine. To demonstrate I'll just modify bt's code a little:

    declare @table table

    (column0 int identity

    ,column1 nvarchar(23),

    column2 VARCHAR(23))

    insert into @table select N'ESPRINET IBÉRICA S.L.U.', 'ESPRINET IBÉRICA S.L.U.'

    ;WITH -- inline Tally table

    E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), --10

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000

    cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N) FROM E4)

    select tab.column1,

    sq.octets,

    tab.column2

    from

    (select t.column0,

    octets = SUM(CASE

    WHEN UNICODE(SUBSTRING(t.column1, tally.N, 1)) <= 127 THEN 1

    ELSE 2 END)

    from cteTally tally

    join @table t

    on tally.N <= LEN(t.column1)

    group by t.column0) sq

    join @table tab

    on tab.column0 = sq.column0

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply