character ordering

TomThomson

SSC Guru

Points: 104773
October 31, 2012 at 10:10 pm

#264108

Comments posted to this topic are about the item character ordering
Tom
November 1, 2012 at 2:10 am

This was removed by the editor as SPAM

honza.mf SSCertifiable Points: 5519 More actions · Answer 1

My first estimate was something near 75 (10 digits, 26 uppercase, 26 lowercase) but then I remembered the accents and found a table.

See, understand, learn, try, use efficient
© Dr.Plch

derek.colley SSCrazy Eights Points: 8040 More actions · Answer 2

As with most QotDs, I try and answer them, right or wrong, before researching the answer, so I can identify gaps in my knowledge rather than score points.

I'm afraid this question was just too obscure for me. It's a good question, sure, but why would I need to know this or reference it in any way? I mean, we have hundreds of databases, most of which actually use this collation, and this nugget has never, and will never come in useful.

It's a numbers question. It's a bit like asking, 'how many system tables exist in the MSDB after a vanilla installation of SQL Server 2005 with SP2 (on a full moon in the Northern Hemisphere while wearing Wellington boots and a silly hat, or any other irrelevance you care to name)?' - the answer can be found, but why would it be relevant to anyone in a DBA/BI/dev position?

I think my views are echoed in the results so far, one of the few QotDs where wrong answers (probably mostly guesses) outnumber right ones.

Sorry to be so harsh. I, too, am without my morning coffee.

---

Note to developers:
CAST(SUBSTRING(CAST(FLOOR(NULLIF(ISNULL(COALESCE(1,NULL),NULL),NULL)) AS CHAR(1)),1,1) AS INT) == 1
So why complicate your code AND MAKE MY JOB HARDER??!:crazy:

Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
My blog: http://uksqldba.blogspot.com
Visit http://www.DerekColley.co.uk to find out more about me.

vk-kirov SSCertifiable Points: 7686 More actions · Answer 3

Note that the QOD query will return 139 characters if it runs on a database with the Latin1_General_CI_AS (or similar) collation. When running on a database with another collation, the results may vary. For example, for a Vietnamese_CI_AS database the query returns 131 characters, for a Cyrillic_General_CS_AS database – 158 characters, for a Japanese_CI_AS_KS_WS database – 122 characters, for a SQL_EBCDIC273_CP1_CS_AS (?!) database – 15 characters. The answer given is correct, though.

honza.mf SSCertifiable Points: 5519 More actions · Answer 4

derek.colley (11/1/2012)
I'm afraid this question was just too obscure for me. It's a good question, sure, but why would I need to know this or reference it in any way? I mean, we have hundreds of databases, most of which actually use this collation, and this nugget has never, and will never come in useful.

For example it's good to know there are accented letters in the area. There are both lowercase and uppercase letters.

Without this knowledge you are not able to understand string comparisons.


derek.colley SSCrazy Eights Points: 8040 More actions · Answer 5

honza.mf (11/1/2012)
For example it's good to know there are accented letters in the area. There are both lowercase and uppercase letters.
Without this knowledge you are not able to understand string comparisons.

I'm already aware of the existence of accented characters in this collation; without accented characters, the collation would not need to be marked as AI or AS.

When doing string comparisons, I would likewise be aware of accents and case differences on letters and non-standard A-Z alphabetic characters anyway.

I would also consult a character map for the collation if comparing by ASCII / Unicode decimal or hex values (much easier than messing around with set-based or RBAR CHAR()-based SQL code).

Your point doesn't detract from mine, which was that this is a pointless question, reliant on the reader's ability to count.


RossRoss SSCommitted Points: 1672 More actions · Answer 6

honza.mf (11/1/2012)
derek.colley (11/1/2012)
I'm afraid this question was just too obscure for me. It's a good question, sure, but why would I need to know this or reference it in any way? I mean, we have hundreds of databases, most of which actually use this collation, and this nugget has never, and will never come in useful.
For example it's good to know there are accented letters in the area. There are both lowercase and uppercase letters.
Without this knowledge you are not able to understand string comparisons.

I'm with Derek on this one: it's a good question but not a relevant one, and I was surprised to see that I was in the largest group, all of whom got it wrong. I don't understand where this would come in useful. We all know this collation has accented characters, which it is sensitive to, and uppercase and lowercase characters, whose case it is insensitive to; the collation name tells us this. For the sake of understanding string comparisons, however, we have no use for knowing how many characters are between 0 and Z; we just need to know the collation type and have an ASCII table handy.

Paul White SSC Guru Points: 150467 More actions · Answer 7

derek.colley (11/1/2012)
I'm afraid this question was just too obscure for me. It's a good question, sure, but why would I need to know this or reference it in any way? I mean, we have hundreds of databases, most of which actually use this collation, and this nugget has never, and will never come in useful.

The specific answer isn't useful, but the process of writing a query to answer it surely could be. I had no idea of the correct answer so I wrote a query very similar to that given in the answer:

SELECT *
FROM dbo.Numbers AS n
WHERE
CHAR(n) COLLATE Latin1_General_CI_AS
BETWEEN '0' AND 'Z';

The idea of QotDs that require the reader to write T-SQL code to find the answer intrigues me. If that was Tom's intention (as I suspect it was) and/or to highlight the usefulness of a Numbers table, I applaud him.
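For readers without a Numbers table to hand, the same idea can be sketched in a self-contained form. This is only an illustration, not the query from the QotD answer: the Digits and Numbers CTEs are hypothetical stand-ins for dbo.Numbers, and the range 0 to 255 is an assumption (single-byte character codes). The characters CHAR() produces also depend on the code page of the current database's default collation, so the row count may differ from one database to another.

```sql
-- Hypothetical stand-in for dbo.Numbers: the integers 0 through 255,
-- built from two copies of the values 0..15 (assumes SQL Server 2008 or later).
WITH Digits AS (
    SELECT v.d
    FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),
                 (8),(9),(10),(11),(12),(13),(14),(15)) AS v(d)
),
Numbers AS (
    SELECT hi.d * 16 + lo.d AS n
    FROM Digits AS hi
    CROSS JOIN Digits AS lo
)
SELECT n.n AS code,
       CHAR(n.n) COLLATE Latin1_General_CI_AS AS ch
FROM Numbers AS n
-- The explicit COLLATE on the tested expression decides which collation
-- the comparison against the two literals uses.
WHERE CHAR(n.n) COLLATE Latin1_General_CI_AS BETWEEN '0' AND 'Z'
ORDER BY ch, code;
```

Replacing the select list with COUNT(*) and dropping the ORDER BY gives just the number of characters in the range.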

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

honza.mf SSCertifiable Points: 5519 More actions · Answer 8

Ross.M (11/1/2012)
... however for the sake of understanding string comparisons, we have no use for knowing how many characters are between 0 and Z, we just need to know the collation type and have an ASCII table handy.

This is one of the tricks of this question. Collation and ASCII tables are two different things.

Collation ordering does not copy the ASCII ordering of characters; it's a little bit more sophisticated.
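A small sketch of that difference, using only plain ASCII characters so that it does not depend on the database code page. The five sample characters and the column names are chosen purely for illustration. It compares each character against the range '0' to 'Z' once under Latin1_General_CI_AS and once under a binary collation, which compares by character code.

```sql
SELECT v.ch,
       ASCII(v.ch) AS code,
       -- Linguistic comparison: letters sort alphabetically, case ignored.
       CASE WHEN v.ch COLLATE Latin1_General_CI_AS BETWEEN '0' AND 'Z'
            THEN 1 ELSE 0 END AS in_range_ci_as,
       -- Binary comparison: characters compare by their code values.
       CASE WHEN v.ch COLLATE Latin1_General_BIN BETWEEN '0' AND 'Z'
            THEN 1 ELSE 0 END AS in_range_bin
FROM (VALUES ('9'), ('B'), ('Z'), ('a'), ('b')) AS v(ch)
ORDER BY v.ch COLLATE Latin1_General_CI_AS, code;
```

Under the binary collation the lowercase letters (codes 97 and 98) fall after 'Z' (code 90) and so outside the range, whereas under Latin1_General_CI_AS they sort among the letters before 'Z' and fall inside it.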

TomThomson SSC Guru Points: 104773 More actions · Answer 9

SQL Kiwi (11/1/2012)
The specific answer isn't useful, but the process of writing a query to answer it surely could be. I had no idea of the correct answer so I wrote a query very similar to that given in the answer:
SELECT *
FROM dbo.Numbers AS n
WHERE
CHAR(n) COLLATE Latin1_General_CI_AS
BETWEEN '0' AND 'Z';
The idea of QotDs that require the reader to write T-SQL code to find the answer intrigues me. If that was Tom's intention (as I suspect it was) and/or to highlight the usefulness of a Numbers table, I applaud him.

You've spotted my secondary intentions. I thought an interesting change from providing code that people could cut and paste into a query window and run would be to make them write their own code to get the answer, and that some people might learn something from being pointed at Tally tables and Jeff's excellent article again.

But I also remembered developers who were very annoyed to discover that 0 to Z contained not only what they thought of as "real" consonants (including the obvious German, French, and Spanish ones), "real" (single digit integer) numerics, and "real" vowels (the usual 5 plus versions with acute, grave, circumflex, and umlaut diacritics) but a host of other things. Although I'm more of a developer than a DBA myself, their reactions (for example griping about the collation being senseless, claiming that there are no useful collations or that SQL is broken if it thinks '¾' is a numeric character or that there "should" only be 72 alphanumeric characters) managed to put me firmly into the foul-tempered DBA camp. So my primary intention was to have people discover that there are a lot more than 72 alphanumeric characters in the ASCII character set with the default collation.

I think the count of answers so far makes it pretty clear that most people haven't a clue what characters fall in there: two thirds of responses have chosen one of the three lowest options, namely 36 (26 letters of the English alphabet, forgetting that there are two cases, plus 10 numeric digits), 43 (36 plus 7: áâéèêîô), or 62 (English alphabet with two cases, ten numeric digits). So maybe two thirds of people who have seen the question have learnt something useful - not that the answer is 139 (who cares about the exact number, as long as they can find it if they ever need it), but that the answer is quite a lot bigger than 72.

Tom

Nils Gustav Stråbø SSChampion Points: 11259 More actions · Answer 10

vk-kirov (11/1/2012)
Note that the QOD query will return 139 characters if it runs on a database with the Latin1_General_CI_AS (or similar) collation. When running on a database with another collation, the results may vary. For example, for a Vietnamese_CI_AS database the query returns 131 characters, for a Cyrillic_General_CS_AS database – 158 characters, for a Japanese_CI_AS_KS_WS database – 122 characters, for a SQL_EBCDIC273_CP1_CS_AS (?!) database – 15 characters. The answer given is correct, though.

That is because the query in the answer is missing a COLLATE for the '0' in BETWEEN, or you can simply use SQL Kiwi's example.

RossRoss SSCommitted Points: 1672 More actions · Answer 11

honza.mf (11/1/2012)
Ross.M (11/1/2012)
... however for the sake of understanding string comparisons, we have no use for knowing how many characters are between 0 and Z, we just need to know the collation type and have an ASCII table handy.
This is one of the tricks of this question. Collation and ASCII tables are two different things.
Collation ordering does not copy the ASCII ordering of characters; it's a little bit more sophisticated.

I understand that; my point, though, is that the order and total are irrelevant. We never need to know how many characters are between 0 and Z in a particular collation.

Nils Gustav Stråbø SSChampion Points: 11259 More actions · Answer 12

Excellent question that requires knowledge of how collations work. 🙂

More of these, Tom.

(Bob Brown) SSCrazy Points: 2705 More actions · Answer 13

November 1, 2012 at 5:56 am

#1554964

I guess some people feel this was a good question?