LIKE a vowel

  • So to correctly test for just those characters should you use:

    @x IN ('a','e','i','o','u') ?

    Good question, an undocumented 'feature' that could lead to unexpected results.

  • Another two that could come up are:

    Þ - the archaic letter thorn - which will come up with 'th' in the select.

    ß - the German "Sharp S" - which will come up with "ss' in the select.

    A few more "undocumented features"...

    I'm guessing on the collation, there are lots of other "opportunities".

    Jim

  • jdamm (12/6/2012)


    Another two that could come up are:

    Þ - the archaic letter thorn - which will come up with 'th' in the select.

    ß - the German "Sharp S" - which will come up with "ss' in the select.

    A few more "undocumented features"...

    I'm guessing on the collation, there are lots of other "opportunities".

    Jim

    You've pushed one of my go buttons there - languages and orthography used to be a big thing for me. (Languages still are.) So now I shall be boring, but perhaps informative.

    In the ISO Latin1 8-bit character set, there are only 5 ligatures: ae,th, AE,TH, and ss.

    Of course this means that Latin1 is missing rather a lot of ligatures which are used by languages which use "roman" characters for writing. Windows Latin1 adds oe and OE (so the French probably prefer Windows Latin1 to ISO, as these are essential for standard French spelling) , so in SQL Server with default collation (or really default code page) we see 7 ligature digraphs in the single byte character set. That's still missing ij and IJ (maybe Hugo will tell us whether Dutch generally uses these ligatures or has generally switched to non-ligatured representation).

    There were once other ligatures in languages that use Latin1 (or a small extension of it). Until a couple of decades ago, Spanish treated CH ch Ll and ll each as an individual character, not each as two characters; German printers once used a ue ligature (originally with the e above the u, instead of to the right of it) in place of ü, but that is not used now (at least I believe not: but people with old typewriters which can't produce ü still sometimes use ue - as two separated characters, not a ligature).

    There 49 (? not sure; I think it's 53 altogether, and as 4 are ligatures .... but I'm not sure 53 is right) other (not ligature) characters used by languages using variants of the roman alphabet that are not in ISO Latin1 (most of them are in Windows Latin1) which has thrown away some obsolete/redundant control characters to make room for them), but they are not ligatures so won't be picked up as matching two adjacent characters. Ten of the 48 are needed for Latin (in the orthography that has vowel length marked), which suggests that Latin1 is a bit of a misnomer.

    Tom

  • L' Eomot Inversé (12/6/2012)


    That's still missing ij and IJ (maybe Hugo will tell us whether Dutch generally uses these ligatures or has generally switched to non-ligatured representation).

    When I was young, Dutch typewriters (and -when I was slightly less young- even some Dutch computer keyboards) had a seperate key for the ligature ij. Rather obviously, actually, since the ij is the 25th letter of the Dutch alphabet. And in case you wonder where we then have left the y - we don't. The y is not officially a letter of the Dutch alphabet, even though we do use it in some words, such a yoghurt. Yes, I agree that this is weird.

    (By the way, when I tried to find confirming sources for this on the internet, I could not. All pages I visited either claim that the y is the 25th letter, or list both ij as 25th and y as 26th letter. This is definitely NOT how I learned it in elementary school!)

    However, that is now a thing of the past. Nobody uses ligature ij anymore. (And frankly, in a proportional font you would not see the difference anyway!) That is probably why all internet pages about the Dutch alphabet list y as the 25th letter instead of ligature ij.

    German printers once used a ue ligature (originally with the e above the u, instead of to the right of it) in place of ü, but that is not used now (at least I believe not: but people with old typewriters which can't produce ü still sometimes use ue - as two separated characters, not a ligature).

    As far as I know, German always considers an e after a vowel equivalent to an unlaut above that vowel - so ae, oe and ue are equivalent to ä, ö and ü. (German never uses an umlaut above an i or e, so ee and ie are just plain ee and ie).


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • At first I was a little miffed about missing the question - especially since there wasn't a hard reference.

    Then after reading the discussions and experimenting with Thorn and Sharp S, I came to like the question. I definitely learned a bit today that could (but probably no chance that it will) catch me in the future.

    Jim

  • simply I selected one to see the explaination and understand the qstn. Really i didn't understand what the qstn is delaing abt...

    Can anyone xplain in detail. If already xplained pls provide me the refernce..

    Thanks

    --
    Dineshbabu
    Desire to learn new things..

  • Dineshbabu (1/7/2013)


    simply I selected one to see the explaination and understand the qstn. Really i didn't understand what the qstn is delaing abt...

    Can anyone xplain in detail. If already xplained pls provide me the refernce..

    Thanks

    Did you read all the posts so far in this topic?

    The short story - in some languages, Æ and æ are official letters. They can also be written as AE and ae. Because of that, 'Æ' LIKE '[AEIOU]' will return true - because the first two letters in the LIKE string (A and E) are considered to represent the Æ character.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (1/7/2013)


    Dineshbabu (1/7/2013)


    simply I selected one to see the explaination and understand the qstn. Really i didn't understand what the qstn is delaing abt...

    Can anyone xplain in detail. If already xplained pls provide me the refernce..

    Thanks

    Did you read all the posts so far in this topic?

    The short story - in some languages, Æ and æ are official letters. They can also be written as AE and ae. Because of that, 'Æ' LIKE '[AEIOU]' will return true - because the first two letters in the LIKE string (A and E) are considered to represent the Æ character.

    Thanks Hugo.

    I understood something. I will try to go through all the xplainations posted here. I think, First of all i shud understand Coallate.

    --
    Dineshbabu
    Desire to learn new things..

  • Thanks for such a thoughtful Question.

Viewing 9 posts - 31 through 38 (of 38 total)

You must be logged in to reply to this topic. Login to reply