• L' Eomot Inversé (7/25/2012)


    Hugo Kornelis (7/11/2012)


    In my opinion, it is LEN() that has the strange behaviour here, because it returns the length of the padded version of the string instead of the length of the string itself. This almost caused me to give the wrong answer - but I recalled just in time the number of times I have already been bitten by this strange behaviour of LEN().

    I don't understand that, Hugo. LEN isn't doing anything with padding except ignoring it. It's the concatenate operation that turns the padding from padding to not padding by putting non-space characters beyond the padding.

    Sorry for the late reply; I was on holiday. (FWIW - Spain, France, and Italy were all great places to be!)

    I'll try to explain what I mean. The varchar data type (assuming standard options) is designed to preserve trailing spaces. So if I assign 'ABC ' to a varchar column or variable, the three trailing spaces are kept, assuming that they are there for a reason. I like this behaviour. I still have the flexibility to get rid of it by using RTRIM(), but I would normally also assume that if someone takes the trouble to type those three extra spacecs, there's a reason for him/her.

    I don't like some of the choices made (either by the ANSI committee or by MS) on how those trailing spaces are handled. LEN() is one of those points. If we agree that those three trailing spaces are relevant, then why would their length not be relevant? And how can anyone explain to me that LEN(@a) is 3, LEN(@b) is 4, but LEN(@a + @b-2) is 10 instead of 7?

    As others have commented, there may be a connection with the behavior of the equality operator. I dislike that behavior too. If those trailing spaces are significant enough to store them, then why would they not be significant enough to affect equality? Or, reversed, if we truly think that 'ABC ' is equal to 'ABC', why not store them both in the same way?

    To reply to your direct question: "LEN isn't doing anything with padding except ignoring it" - LEN is not ignoring padding (which is the addition of extra spaces to arrive at a certain length); it is ignoring explicitly added and stored whitespace at the end.

    I will concede that there are good arguments for ignoring trailing whitespace, the best being that a human would ignore that trailing whitespace too (because in print, it is invisible). For me personally, I think the arguments in favor of retaining whitespace are stronger. But more to the point - those in favor of ignoring whitespace should advicate not storing it at all, rather than storing it and then disregarding it in some (but not all!) cases.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/