Trailing spaces in WHERE-Clause

  • I have the following Tables

    [font="Courier New"]ItemTable

    Item; Name

    A_; Item A_

    A; Item A

    Stock

    Item; Stock

    A_; 25

    A; 14[/font]

    I have displayed a trailing space with an '_'.

    When I now want to join these two tables, i have a problem, i have trailing spaces that I have to. If i only use a standard WHERE clause, the Results a 4 Rows, not two, and therefore I get a wrong result.

    I have found the following Code sniplet in the internet, but I do not know how it performs.

    SELECT

    Stock.Item,

    ItemTable.Name,

    Stock.Stock

    FROM

    Stock

    INNER JOIN

    ItemTable

    ON ItemTable.Item = Stock.Item

    AND DATALENGTH(ItemTable.Item) = DATALENGTH(Stock.Item)

    What do you think. Is this a good way?

    Or is it better to make the field ItemTable.Item and Stock.Item to a COLLATE that will compare also this trailing spaces?

  • Using a binary collation was my initial thought. I suspect that adding the check for datalength will be slower than using the collation method but have not tested out that theory.

    Using a binary collation will handle the trailing spaces and may perform better than also checking the data length however it introduces other issues with respect to comparing upper-case versions of characters to their to lower-case couterparts.

    This post from a MS support team member summarizes it pretty well. It's from 2004 but I think it still holds true in current versions of SQL Server:

    Collations Reposted from http://www.sqlmonster.com/Uwe/Forum.aspx/sql-server-programming/22551/Collation in case that URL is taken down:


    John Smith - 13 Oct 2004 17:39 GMT

    I am using SQL Server 2000.

    When I run a query I want to differentiate between the data "E" and "E ".

    How to differentiate the data which has blank space after it and which does

    not have blank space after it? Thanks for your answers.

    Smith

    Reply to this Message


    Bart Duncan [MSFT] - 13 Oct 2004 18:09 GMT

    ANSI SQL-92 suggests that two strings which differ only in trailing spaces

    should be considered equal. In SQL Server, with most collations a simple

    '=' operator complies with this. In SQL there are only two exceptions to

    this that I know of:

    - Use 'LIKE' instead of '='. LIKE considers trailing spaces in the search

    criteria (right operand) to be significant. However, trailing blanks in

    the column to be searched (left operand) are still ignored. See the

    Remarks section of the "LIKE" topic in Books Online for details and some

    examples.

    - Use a binary collation such as SQL_Latin1_General_BIN. Note, however,

    that this will treat all code points as different characters. For example,

    'e' will be considered a different string than 'E'.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (3/17/2011)


    Using a binary collation was my initial thought. I suspect that adding the check for datalength will be slower than using the collation method but have not tested out that theory.

    I tried a binary collation and it still treated the two strings as the same, but converting the strings to binary data did treat them differently. You will still have the same problem with case sensitivity.

    I think that there are very good reasons that the ANSI SQL-92 standard recommends that these strings be equivalent. How do you distinguish these two values in the following circumstances:

    * When stored in a CHAR/NCHAR field/variable

    * When output to a fixed width file

    * When output to a web interface

    * In a data entry screen

    Have you considered making your data ANSI SQL-92 compliant instead of trying to force SQL Server to deal with non-compliant data?

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • TEST! TEST! TEST! ...I should not have taken Bart's word on collations for granted! Thanks J. Drew. Why did you stop blogging?

    alexander.kaltenbrunner, you can do as J. Drew suggested and explore making your data compliant per ANSI-92...or you always have the DATALENGTH and VARBINARY(MAX) solutions available to evaluate.

    DECLARE @string_1 VARCHAR(10) = 'Item A ',

    @string_2 VARCHAR(10) = 'Item A' ;

    IF (@string_1 COLLATE SQL_Latin1_General_CP1_CI_AS = @string_2 COLLATE SQL_Latin1_General_CP1_CI_AS)

    PRINT 'equal with SQL_Latin1_General_CP1_CI_AS'

    ELSE

    PRINT 'not equal with SQL_Latin1_General_CP1_CI_AS'

    IF (@string_1 COLLATE Latin1_General_BIN = @string_2 COLLATE Latin1_General_BIN)

    PRINT 'equal with Latin1_General_BIN'

    ELSE

    PRINT 'not equal with Latin1_General_BIN'

    IF (@string_1 COLLATE Latin1_General_BIN2 = @string_2 COLLATE Latin1_General_BIN2)

    PRINT 'equal with Latin1_General_BIN'

    ELSE

    PRINT 'not equal with Latin1_General_BIN'

    IF (CONVERT(VARBINARY(MAX), @string_1) = CONVERT(VARBINARY(MAX), @string_2))

    PRINT 'equal as VARBINARY(MAX)'

    ELSE

    PRINT 'not equal as VARBINARY(MAX)'

    edit: "compliant per ANSI-92"

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply