character ordering
Posted Thursday, November 1, 2012 5:24 AM


SQL Kiwi (11/1/2012)
The specific answer isn't useful, but the process of writing a query to answer it surely could be. I had no idea of the correct answer so I wrote a query very similar to that given in the answer:

SELECT * 
FROM dbo.Numbers AS n
WHERE
CHAR(n) COLLATE Latin1_General_CI_AS
BETWEEN '0' AND 'Z';

The idea of QotDs that require the reader to write T-SQL code to find the answer intrigues me. If that was Tom's intention (as I suspect it was) and/or to highlight the usefulness of a Numbers table, I applaud him.

You've spotted my secondary intentions. I thought an interesting change from providing code that people could cut and paste into a query window and run would be to make them write their own code to get the answer, and that some people might learn something from being pointed at Tally tables and Jeff's excellent article again.

But I also remembered developers who were very annoyed to discover that 0 to Z contained not only what they thought of as "real" consonants (including the obvious German, French, and Spanish ones), "real" (single digit integer) numerics, and "real" vowels (the usual 5 plus versions with acute, grave, circumflex, and umlaut diacritics) but a host of other things. Although I'm more of a developer than a DBA myself, their reactions managed to put me firmly into the foul-tempered DBA camp: griping about the collation being senseless, claiming that there are no useful collations, that SQL is broken if it thinks '¾' is a numeric character, or that there "should" only be 72 alphanumeric characters. So my primary intention was to have people discover that there are a lot more than 72 alphanumeric characters in the ASCII character set with the default collation.

I think the count of answers so far makes it pretty clear that most people haven't a clue what characters fall in there. Two thirds of responses have chosen one of the three lowest options: 36 (26 letters of the English alphabet, forgetting that there are two cases, plus 10 numeric digits), 43 (36 plus 7: áâéèêîô), or 62 (English alphabet with two cases, plus ten numeric digits). So maybe two thirds of people who have seen the question have learnt something useful - not that the answer is 139 (who cares about the exact number, as long as they can find it if they ever need it), but that the answer is quite a lot bigger than 72.


Tom
Post #1379743
Posted Thursday, November 1, 2012 5:27 AM


vk-kirov (11/1/2012)
Note that the QOD query will return 139 characters if it runs on a database with the Latin1_General_CI_AS (or similar) collation. When running on a database with another collation, the results may vary. For example, for a Vietnamese_CI_AS database the query returns 131 characters, for a Cyrillic_General_CS_AS database – 158 characters, for a Japanese_CI_AS_KS_WS database – 122 characters, for a SQL_EBCDIC273_CP1_CS_AS (?!) database – 15 characters. The answer given is correct, though.


That is because the query in the answer is missing a COLLATE on the '0' in the BETWEEN. Add one there, or simply use SQL Kiwi's example.
Post #1379745
Posted Thursday, November 1, 2012 5:29 AM
honza.mf (11/1/2012)
Ross.M (11/1/2012)
... however for the sake of understanding string comparisons, we have no use for knowing how many characters are between 0 and Z, we just need to know the collation type and have an ascii table handy.

This is one of the tricks of this question. Collation and ASCII tables are two different things.
Collation ordering does not copy the ASCII ordering of characters; it's a little bit more sophisticated.


I understand that; my point, though, is that the order and total are irrelevant. We never need to know how many characters are between 0 and Z in a particular collation.
Post #1379746
Posted Thursday, November 1, 2012 5:31 AM


Excellent question that requires knowledge of how collations work.
More of these, Tom.
Post #1379749
Posted Thursday, November 1, 2012 5:56 AM


I guess some people feel this was a good question?
Post #1379762
Posted Thursday, November 1, 2012 6:05 AM


Ross.M (11/1/2012)
honza.mf (11/1/2012)
Ross.M (11/1/2012)
... however for the sake of understanding string comparisons, we have no use for knowing how many characters are between 0 and Z, we just need to know the collation type and have an ascii table handy.

This is one of the tricks of this question. Collation and ASCII tables are two different things.
Collation ordering does not copy the ASCII ordering of characters; it's a little bit more sophisticated.


I understand that; my point, though, is that the order and total are irrelevant. We never need to know how many characters are between 0 and Z in a particular collation.

See Tom's answer above. Yes, the number is pointless. The methods used to obtain it are what matter.




See, understand, learn, try, use efficient
© Dr.Plch
Post #1379764
Posted Thursday, November 1, 2012 6:17 AM
I'd agree that the intent was more important than the answer. Well done.

------------
Buy the ticket, take the ride. -- Hunter S. Thompson
Post #1379772
Posted Thursday, November 1, 2012 6:51 AM
Nils Gustav Stråbø (11/1/2012)
vk-kirov (11/1/2012)
Note that the QOD query will return 139 characters if it runs on a database with the Latin1_General_CI_AS (or similar) collation. When running on a database with another collation, the results may vary. For example, for a Vietnamese_CI_AS database the query returns 131 characters, for a Cyrillic_General_CS_AS database – 158 characters, for a Japanese_CI_AS_KS_WS database – 122 characters, for a SQL_EBCDIC273_CP1_CS_AS (?!) database – 15 characters. The answer given is correct, though.


That is because the query in the answer is missing a COLLATE on the '0' in the BETWEEN. Add one there, or simply use SQL Kiwi's example.

Not so simple
Try the following code (based on Paul's query):
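-- Scratch database whose default collation differs from the one in the question.
CREATE DATABASE qod_collation_db COLLATE Japanese_CI_AS_KS_WS;
GO
USE qod_collation_db;
GO
-- Generate the codes 0 to 255 with a recursive CTE.
WITH Numbers AS
( SELECT 0 AS n
UNION ALL
SELECT n + 1
FROM Numbers
WHERE n < 255
)
SELECT n AS code, CHAR(n) AS symbol
FROM Numbers
WHERE CHAR(n) COLLATE Latin1_General_CI_AS BETWEEN '0' AND 'Z'
-- The default recursion limit of 100 is too low for 255 steps.
OPTION(MAXRECURSION 256);
GO
-- Clean up the scratch database.
USE master;
GO
DROP DATABASE qod_collation_db;
GO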

It returns 62 characters. With the Cyrillic_General_CS_AS collation, you'll get 63 characters; with Vietnamese_CI_AS – 131 etc.
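
A likely reason the counts still move with the database collation, even with an explicit COLLATE on the comparison: CHAR(n) interprets n in the code page of the current database's default collation, so the character being compared can already differ before the comparison collation is applied. A minimal sketch for checking which code page each collation named in this thread uses (the column aliases are illustrative, and the VALUES constructor assumes SQL Server 2008 or later):

```sql
-- Code page behind each collation mentioned in the thread.
SELECT c.name AS collation_name,
       COLLATIONPROPERTY(c.name, 'CodePage') AS code_page
FROM (VALUES
        ('Latin1_General_CI_AS'),
        ('Vietnamese_CI_AS'),
        ('Cyrillic_General_CS_AS'),
        ('Japanese_CI_AS_KS_WS'),
        ('SQL_EBCDIC273_CP1_CS_AS')
     ) AS c(name);

-- Default collation of the database the query is running in;
-- this is the one CHAR(n) uses to turn a code into a character.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS database_collation;
```

Collations that report different code pages map the codes above 127 to different characters, which would be enough to change the count.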
Post #1379784
Posted Thursday, November 1, 2012 8:06 AM


SQL Kiwi (11/1/2012)
derek.colley (11/1/2012)
I'm afraid this question was just too obscure for me. It's a good question, sure, but why would I need to know this or reference it in any way? I mean, we have hundreds of databases, most of which actually use this collation, and this nugget has never, and will never come in useful.

The specific answer isn't useful, but the process of writing a query to answer it surely could be. I had no idea of the correct answer so I wrote a query very similar to that given in the answer:

SELECT * 
FROM dbo.Numbers AS n
WHERE
CHAR(n) COLLATE Latin1_General_CI_AS
BETWEEN '0' AND 'Z';

The idea of QotDs that require the reader to write T-SQL code to find the answer intrigues me. If that was Tom's intention (as I suspect it was) and/or to highlight the usefulness of a Numbers table, I applaud him.


What's interesting to me is that the query from the original question returns 120 rows:
select CHAR(I),I from Tally 
where char(I) between '0' and 'Z' collate latin1_general_ci_as
and I < 256 order by CHAR(I)

However, your query gives 139. I also realize why (one applies the collation to 0 thru Z, the other to the result of the CHAR function), but I did find that interesting.
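
To spell out the difference: BETWEEN is shorthand for a >= test and a <= test, and a COLLATE clause attaches to one expression only. In the question's query it follows 'Z', so, as suggested earlier in the thread, only the upper-bound comparison is forced to Latin1_General_CI_AS while the lower-bound comparison falls back to the database default. A sketch that makes both comparisons explicit, assuming the same Tally table with an integer column I as in the query above:

```sql
-- Both bounds are compared under the same explicit collation,
-- so the result no longer depends on which operand carries COLLATE.
SELECT CHAR(I) AS symbol, I AS code
FROM Tally
WHERE I < 256
  AND CHAR(I) >= '0' COLLATE Latin1_General_CI_AS
  AND CHAR(I) <= 'Z' COLLATE Latin1_General_CI_AS
ORDER BY CHAR(I) COLLATE Latin1_General_CI_AS;
```

On a database whose default collation is already Latin1_General_CI_AS the two forms should agree; the row counts can differ only when the default collation sorts differently.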




--Mark Tassin
MCITP - SQL Server DBA
Proud member of the Anti-RBAR alliance.
Post #1379828
Posted Thursday, November 1, 2012 8:09 AM


Oh and of course I got it wrong... because I just went with
SELECT ASCII('Z')-ASCII('0')

Sadly I'm just too dang American... A thru Z, 0 thru 9... sure... but ASCII isn't Latin1_General_CI_AS
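
For anyone who wants to see the gap between code order and dictionary order side by side, here is a rough sketch. The Codes CTE is a hypothetical stand-in for a Tally table; it assumes sys.all_objects holds at least 256 rows, which is normally the case.

```sql
-- Codes 0 to 255 from an inline row source (assumption: sys.all_objects has >= 256 rows).
WITH Codes AS
(
    SELECT TOP (256)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS int) AS n
    FROM sys.all_objects
)
SELECT
    -- Plain code arithmetic: 90 - 48 = 42.
    ASCII('Z') - ASCII('0') AS ascii_difference,
    -- A binary collation compares by code value.
    SUM(CASE WHEN CHAR(n) COLLATE Latin1_General_BIN BETWEEN '0' AND 'Z'
             THEN 1 ELSE 0 END) AS binary_count,
    -- A dictionary collation sorts accented letters in among A to Z.
    SUM(CASE WHEN CHAR(n) COLLATE Latin1_General_CI_AS BETWEEN '0' AND 'Z'
             THEN 1 ELSE 0 END) AS dictionary_count
FROM Codes;
```

The subtraction gives 42, and the inclusive range of codes 48 to 90 holds 43 values, which is what the binary count should track on a code page 1252 database. The dictionary count is the one the question was really about, and it depends on the database as discussed above.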




--Mark Tassin
Post #1379829