March 17, 2011 at 1:47 am
I have the following Tables
[font="Courier New"]ItemTable
Item; Name
A_; Item A_
A; Item A
Stock
Item; Stock
A_; 25
A; 14[/font]
I have displayed a trailing space with an '_'.
When I now want to join these two tables, i have a problem, i have trailing spaces that I have to. If i only use a standard WHERE clause, the Results a 4 Rows, not two, and therefore I get a wrong result.
I have found the following Code sniplet in the internet, but I do not know how it performs.
SELECT
Stock.Item,
ItemTable.Name,
Stock.Stock
FROM
Stock
INNER JOIN
ItemTable
ON ItemTable.Item = Stock.Item
AND DATALENGTH(ItemTable.Item) = DATALENGTH(Stock.Item)
What do you think. Is this a good way?
Or is it better to make the field ItemTable.Item and Stock.Item to a COLLATE that will compare also this trailing spaces?
March 17, 2011 at 1:59 pm
Using a binary collation was my initial thought. I suspect that adding the check for datalength will be slower than using the collation method but have not tested out that theory.
Using a binary collation will handle the trailing spaces and may perform better than also checking the data length however it introduces other issues with respect to comparing upper-case versions of characters to their to lower-case couterparts.
This post from a MS support team member summarizes it pretty well. It's from 2004 but I think it still holds true in current versions of SQL Server:
Collations Reposted from http://www.sqlmonster.com/Uwe/Forum.aspx/sql-server-programming/22551/Collation in case that URL is taken down:
John Smith - 13 Oct 2004 17:39 GMTI am using SQL Server 2000.
When I run a query I want to differentiate between the data "E" and "E ".
How to differentiate the data which has blank space after it and which does
not have blank space after it? Thanks for your answers.
Smith
Reply to this Message
Bart Duncan [MSFT] - 13 Oct 2004 18:09 GMTANSI SQL-92 suggests that two strings which differ only in trailing spaces
should be considered equal. In SQL Server, with most collations a simple
'=' operator complies with this. In SQL there are only two exceptions to
this that I know of:
- Use 'LIKE' instead of '='. LIKE considers trailing spaces in the search
criteria (right operand) to be significant. However, trailing blanks in
the column to be searched (left operand) are still ignored. See the
Remarks section of the "LIKE" topic in Books Online for details and some
examples.
- Use a binary collation such as SQL_Latin1_General_BIN. Note, however,
that this will treat all code points as different characters. For example,
'e' will be considered a different string than 'E'.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
March 17, 2011 at 2:36 pm
opc.three (3/17/2011)
Using a binary collation was my initial thought. I suspect that adding the check for datalength will be slower than using the collation method but have not tested out that theory.
I tried a binary collation and it still treated the two strings as the same, but converting the strings to binary data did treat them differently. You will still have the same problem with case sensitivity.
I think that there are very good reasons that the ANSI SQL-92 standard recommends that these strings be equivalent. How do you distinguish these two values in the following circumstances:
* When stored in a CHAR/NCHAR field/variable
* When output to a fixed width file
* When output to a web interface
* In a data entry screen
Have you considered making your data ANSI SQL-92 compliant instead of trying to force SQL Server to deal with non-compliant data?
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
March 17, 2011 at 3:15 pm
TEST! TEST! TEST! ...I should not have taken Bart's word on collations for granted! Thanks J. Drew. Why did you stop blogging?
alexander.kaltenbrunner, you can do as J. Drew suggested and explore making your data compliant per ANSI-92...or you always have the DATALENGTH and VARBINARY(MAX) solutions available to evaluate.
DECLARE @string_1 VARCHAR(10) = 'Item A ',
@string_2 VARCHAR(10) = 'Item A' ;
IF (@string_1 COLLATE SQL_Latin1_General_CP1_CI_AS = @string_2 COLLATE SQL_Latin1_General_CP1_CI_AS)
PRINT 'equal with SQL_Latin1_General_CP1_CI_AS'
ELSE
PRINT 'not equal with SQL_Latin1_General_CP1_CI_AS'
IF (@string_1 COLLATE Latin1_General_BIN = @string_2 COLLATE Latin1_General_BIN)
PRINT 'equal with Latin1_General_BIN'
ELSE
PRINT 'not equal with Latin1_General_BIN'
IF (@string_1 COLLATE Latin1_General_BIN2 = @string_2 COLLATE Latin1_General_BIN2)
PRINT 'equal with Latin1_General_BIN'
ELSE
PRINT 'not equal with Latin1_General_BIN'
IF (CONVERT(VARBINARY(MAX), @string_1) = CONVERT(VARBINARY(MAX), @string_2))
PRINT 'equal as VARBINARY(MAX)'
ELSE
PRINT 'not equal as VARBINARY(MAX)'
edit: "compliant per ANSI-92"
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply