String comparison with trailing spaces

  • It's been our experience (going back more than 20 years) that in this type of application our customers expect enough security to prevent a casual attempt to access the parts, but don't expect that it should be able to withstand a serious attempt at hacking the password. The Windows logins provide a level of security in relation to network resources, etc.

    The password is only one example of the situation, there are several other fields where this has been a problem and it's likely that there are more that we haven't come across yet.

    From our perspective it appears that Microsoft has made a purely arbitrary decision to change the way that SQL Server handles this kind of data. I've not found any reason that explains why this is a good thing. I expect we will have to live with that and make whatever changes in our code it takes to get around the issue, but it would sure be nice to understand the alleged benefit.

  • Jack Corbett (5/13/2008)

    Although note once again the odd behavior of the final row of the ANSI PADDING OFF results. Any explanation for that?

    The clue to this odd-looking behaviour is in the BOL entry for ANSI_PADDING:

    Controls the way the column stores values shorter than the defined size of the column, and the way the column stores values that have trailing blanks in char, varchar, binary, and varbinary data.

    The emphasis of the word 'shorter' is mine. The T-SQL used to construct the sample data (the UNION ALL statements) contains a subtlety: the result of the SELECT...UNION ALL statements is a table, and each column of that table has a type. Where constants or string literals are used, SQL Server applies some poorly-documented rules to determine the implied type of the columns resulting from the UNIONs.

    In this case, the maximum length of the string literals is seven, so the first column (the one with the variable number of spaces after the character 'a') is implicitly typed as VARCHAR(7). This brings me to the importance of the word 'shorter' mentioned earlier. The trimming of spaces is applied to every row of the UNIONed result which is shorter than VARCHAR(7) - that is, all the rows except the last one. That, in a nutshell, is the reason why the results are different for the last row.

    This is a first-class example of the sort of counter-intuitive behaviour that has earned ANSI_PADDING OFF its deprecated status.

    One other thing to note: the weirdness of ANSI_PADDING OFF only applies to CHAR, VARCHAR, and BINARY - not NCHAR or NVARCHAR. The sample data is supplied as VARCHAR (no N prefix to the literals), but the final table destination column is typed as NVARCHAR(15). Sadly, ANSI_PADDING works its evil on the tabular result of the VARCHAR table created by the UNION, before the implicit conversion to NVARCHAR(15). It is just horrible, really.

    Quick sample code to demonstrate my point about the UNIONs (try it with ANSI_PADDING ON and OFF):



    FROM (

    SELECT 'a', 'b' UNION ALL

    SELECT 'a ', 'b' UNION ALL

    SELECT 'a ', 'b' UNION ALL

    SELECT 'a ', 'b' UNION ALL

    SELECT 'a ', 'b' UNION ALL

    SELECT 'a ', 'b' UNION ALL

    SELECT 'a ', 'b'

    ) T (a, b);


    len_a b


    1 b

    1 b

    1 b

    1 b

    1 b

    1 b

    7 b

    If you re-run the original demo code with N prefixes for the UNIONed literals, everything works as expected - because ANSI_PADDING OFF has no effect on Unicode data.

    I realise this thread is two years old or so, but I read back to get the history of the question, and found this discussion which didn't seem to have a fully satisfactory answer.

    Final point: the database settings for things like ANSI_PADDING are all OFF by default. These defaults are only applied if no setting is specified when the connection is made. All clients and drivers I am aware of set ANSI_PADDING ON at connection time.


Viewing 2 posts - 31 through 32 (of 32 total)

You must be logged in to reply to this topic. Login to reply