Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Comparing ss and ß using LIKE Expand / Collapse
Author
Message
Posted Thursday, June 21, 2007 5:10 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, September 09, 2008 1:14 PM
Points: 26, Visits: 6
In the Latin1_General collation, ss is treated like ß (German sharp s) in comparisons:

'ss' = 'ß' --> true

Note that this is also true with LIKE:

'ss' LIKE 'ß' --> true

But if a '%' is part of the search pattern, things become complicated:

'ss' LIKE 'ß%' --> true
'ß' LIKE 'ss%' --> false (!!!)

Notably, the following are all true:

'ß' LIKE '%ss' --> true
'ß' LIKE '%ss%' --> true
'ß' LIKE 'ss%%' --> true
'ßa' LIKE 'ss%' --> true

Take a look at the tests with two ß:

'ßß' LIKE 'ssss%' --> false (guessed that)
'ßßa' LIKE 'ssss%' --> false (now that's a surprise)
'ßßaa' LIKE 'ssss%' --> true

Alright, there's a pattern in this behavior:

'ßßßaa' LIKE 'ssssss%' --> false
'ßßßaaa' LIKE 'ssssss%' --> true

My guess is that LIKE is optimized for the special case of patterns that end with '%' and contain no other wildcards. It seems that the comparison always fails if the searched text is shorter than the non-wildcard part of the pattern.

My questions are:

1. Is this behavior a bug or a feature?
2. Are there other subtleties in the behavior of LIKE?
3. Is the precise algorithm for LIKE comparison documented somewhere?

I need to rebuild the comparison routine in C as accurately as possible. I've tried SQL Server 2000 and 2005.

I appreciate any help on this.
Post #375531
Posted Monday, August 25, 2008 2:50 AM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, August 01, 2011 8:54 PM
Points: 177, Visits: 455
Hi there...

I found an answer to your problem

Convert them to Nvarchar.

Ok ok, I knowusing CAST or CONVERT would add some more code... but you can do something like this

IF N'ßßa' LIKE N'ssss%'
SELECT 'TRUE'

or you could just put the N'' to only one of the two...

if you start a string with an N automatically converts it into an NVARCHAR Format

Correct me if I'm wrong... but I think its because of the difference in the bytes used... uhmmm like ASCII??? actually I'm not really sure of the reason :P

well atleast it worked


_____________________________________________
Quatrei Quorizawa
:):D:P;)
MABUHAY PHILIPPINES!

"Press any key...
Where the heck is the any key?
hmmm... Let's see... there's ESC, CTRL, Page Up...
but no any key"
- Homer Simpson
Post #558008
Posted Tuesday, August 26, 2008 7:25 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 1:20 PM
Points: 419, Visits: 1,264
Kirill Müller (6/21/2007)
In the Latin1_General collation, ss is treated like ß (German sharp s) in comparisons:

Apart from your specific problems regarding the LIKE operator, I'm very surprised that ss is treated like ß. Thus in is the sequence:

create table t(c varchar(10))
insert into t values('ß')
insert into t values('ss')

select *
from t
where c='ss'

select *
from t
where c='ß'

both records always get selected.
I was under the impression that ß can 'optionally' be written as 'ss' but this is not true the other way around. In any case, I just don't see the ability to make this distinction built into a collation sequence.
Post #558858
Posted Friday, September 05, 2008 4:41 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Friday, January 06, 2012 9:09 AM
Points: 892, Visits: 852
I was under the impression that ß can 'optionally' be written as 'ss' but this is not true the other way around.

If I remember correctly, under the German spelling reform in the 1990s, all occurrences of ß were changed to ss. This spelling reform had many problems (mostly a lack of popularity), and was officially rejected/changed a while later. So many words were changed back to being officially spelt with ß. Hard work for teachers. I do agree with the quote above, though.
Post #564441
Posted Friday, September 05, 2008 4:46 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, September 09, 2008 1:14 PM
Points: 26, Visits: 6
No, actually, the ß is used after "long" vowels (like in "Fuß" which is pronounced like "fooss") and after diphthongs (like "außen"). All other instances of ß have been replaced by ss, the most popular word that has been changed is "dass" (translated: "so that").

Thanks for your replies!
Post #564446
Posted Monday, September 08, 2008 6:01 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Friday, January 06, 2012 9:09 AM
Points: 892, Visits: 852
Hi Kirill,

Thanks for the explanation; I am trying to learn German, but I am finding it hard work. Bring back Esperanto :)
Post #565311
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse