trailing spaces not honored in a select

  • Maybe I am missing something, but I do not understand the result set from the following code.

    Can someone explain this to me?

    DECLARE @tbl TABLE

    (

    testdata VARCHAR(1024)

    )

    INSERT INTO @tbl ( testdata )

    VALUES ( 'a')--no spaces

    INSERT INTO @tbl ( testdata )

    VALUES ( 'a ')--1 space

    INSERT INTO @tbl ( testdata )

    VALUES ( 'a ')--2 spaces

    INSERT INTO @tbl ( testdata )

    VALUES ( 'a ')--3 spaces

    SELECT * FROM @tbl t WHERE t.testdata='a'--no spaces(actually it does not matter how many spaces i put after the 'a)

    returns 4 rows.

    Steve

  • Read about ANSI_PADDING. You can insert and store trailing spaces if you really wish to do so. Note, it will not affect the SQL string comparison rules (eg. WHERE and HAVING clauses).

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • it will not affect the SQL string comparison rules (eg. WHERE and HAVING clauses).

    Why are trailing spaces considered insignificant in relation to comparison rules?

    And I guess more important that that, Is this behavior generally accepted or is there configuration to control it?

    Steve

  • You can insert and store trailing spaces if you really wish to do so.

    it appears to be storing the spaces

    when i change the select statement to the following

    SELECT '>' + t.testdata + '<' FROM @tbl t WHERE t.testdata='a'

    I get this result set

    >a< --no spaces

    >a < --one space

    >a < --two spaces

    >a < --three spaces

    Steve

  • Edited.

    Steve, when I ran your query I saw all columns with spaces having the same length when you wrapped them in ><. Without the wrapper, the LEN function always returns a length of 1 for the various combinations being stored, although BOL says it should include trailing spaces in the count.

    When I doubled the number of spaces in the constant text being inserted, the '>column<' lengths were different, regardless of the ansi_padding setting. I can't resolve the results with what I'm reading in BOL. If y'all haven't resolved this by this evening, I will spend some serious time researching it.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline (6/10/2010)


    Despite the BOL language that suggests ANSI_PADDING can allow you to store trailing blanks in a column with VARCHAR datatype, it appears that the actual behavior is different.

    I think the results of the queries below may surprise you. Looking at the LEN() value, the trailing blanks are not being stored at all, regardless of the ANSI_PADDING setting.

    Well that is becasue the LEN function, like the comparisons, ignores trailing spaces. The spaces are still there. DATALENGTH will count trailing spaced though:

    DECLARE @Foo VARCHAR(100)

    SET @Foo = 'a '

    SELECT LEN(@foo), DATALENGTH(@Foo)

  • Maybe this will help?

    http://support.microsoft.com/kb/316626

    EDIT: Forgot to markup the URL.

  • Lamprey,

    Thanks for the cite. I knew there had to be something out there on point. 🙂

    I was about to challenge your statement about the LEN() but stopped to think about it and realized I agreed with you.

    I'm tossing this in because it may save someone else some confusion in the future:

    When I used the help index from SSMS to go to the LEN() function, it showed me only the LEN() for SSIS, which (quite annoyingly) behaves differently from LEN() for T-SQL. To get to the LEN() function for T-SQL, I had to change the filtering to show only SQL Database engine results. This was a wake-up call for me.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • GREAT ARTICLE.

    It explains a lot, however, it seems to indicate that the LIKE operator would allow you to find the non padded a.

    When I tested it

    SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE t.testdata LIKE 'a'

    SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE Patindex ('a',t.testdata) != 0

    SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE t.testdata='a'

    All three of these select statements return 4 rows of data.

    Steve

  • Like is a bit funky when it comes to trailing spaces. You need to switch things around for a proper comparison:SELECT testdata

    FROM @tbl

    WHERE 'a' LIKE testdata

  • Thanks to all

    So the summarized information is here

    DECLARE @tbl TABLE

    (

    testdata VARCHAR(1024)

    )

    INSERT INTO @tbl ( testdata )

    VALUES ( 'a') --0 spaces

    INSERT INTO @tbl ( testdata )

    VALUES ( 'a ') --1 space

    INSERT INTO @tbl ( testdata )

    VALUES ( 'a ') --2 spaces

    INSERT INTO @tbl ( testdata )

    VALUES ( 'a ') --3 spaces

    Because of Ansi-92 specification the following selects return all 4 rows

    SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE t.testdata='a'

    SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE t.testdata LIKE 'a'

    Because of the curious behavior of LIKE and PATINDEX the following selects return the 1 desired row.

    SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE 'a' LIKE t.testdata

    SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE Patindex (t.testdata,'a') != 0

    If you want to know the actual Length of data, DATALENGTH() is your friend. Throw Len() away.

    Steve

  • I wouldn't say there is anything curious about LIKE's behavior.

    It allows for the building of different kinds of strings for a variety of "like" comparisons. In the absence of a wildcard in the string, it is basically an equality test. To get all strings starting with "A", the standard string for comparison is 'A%', not 'A'.

    Just to be a devil's advocate, I will now ask why trailing spaces need ever be considered significant? Why not just routinely scrub them from your data?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • It's also worth noting that when using fixed width data types such as char and nchar, any values shorter than the maximum length will be padded with trailing spaces regardless of whether the insert included them:

    CREATE TABLE #test (char_col CHAR(10), varchar_col VARCHAR(10))

    INSERT INTO #test

    SELECT 'testing123', 'testing123' UNION ALL

    SELECT 'test','test'

    SELECT * FROM #test

    WHERE RIGHT(char_col,1)=' '

    It's a common gotcha if using cached lookups in SSIS as by default, this doesn't consider a strings to be equal when there is a mismatch in trailing spaces.

  • I wouldn't say there is anything curious about LIKE's behavior.

    The curious behavior to which I was referring is the subtle nuance between the following statements

    SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE t.testdata LIKE 'a'

    SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE 'a' LIKE t.testdata

    It is not intuitive, at least not to me, that the two statements would return different result sets.

    To me that is curious. It may not be to others.

    Steve

  • I will now ask why trailing spaces need ever be considered significant

    It is not significant, unless you have data that came from an outside source, say migrated data.

    If you did not know that trailing spaces have this effect, you would not know to code for it.

    Also, (I have not tested this),I will bet that if I were to put a unique index on the field that the constraint would not be violated.

    In that case If I ran a select on a value that was expected to be unique and I got multiple rows... well you can guess from there.

    Steve

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply