Worst Practices - Making Databases Case Sensitive (Or Anything Else)

Question

Worst Practices - Making Databases Case Sensitive (Or Anything Else)

Viewing 11 posts - 76 through 86 (of 86 total)

You must be logged in to reply to this topic. Login to reply

dukemus SSC Journeyman Points: 88 More actions · Answer 1

First of all, I have to tell you I am pro case sensitive. Why? First programs I've ever written were for Linux (C), so I was born with CS in my blood :).

Now let’s get back to SQL problem.

The article specified something about small loss of performance due to changing all letters in a query to same set (capitals or lowers doesn't matter) - for writing a query it is ok, not too much overhead ("I'm sure that takes a CPU cycle or two"), but consider “like” or “=” on a column in table with several million rows, joining this with other n tables... Your CPU has no idea of letters, it knows only numbers ('A' != 'a') so for each letter the software has to load a letter in a CPU register, check if is in not in, let’s say, capitals range and if it is not, it has to add a number to it and compare it to whatever it has to compare. I counted extra 3 operations on CPU complemented by corresponding memory access (read/write). Imagine doing this for all rows in a table (worst case) and for a number of letters for each row. Maybe for small applications it will be ok, but for large ones with hundreds of queries like this per second you have a problem.

One other thing that I could not understand from the article - why do you have to use triggers to check data? Why can't you use stored procedures to write data? In the stored procedure, before inserting a string in a column that will be searched you can apply upper() or lower() to the variable that contains the data and search procedures will do the same. This way u can apply unique indexes or whatever you like.

A case sensitive DB application will always work on a non case sensitive server if you make sure that you do not have columns or variables that differ only by casing. The other way is not true most of the time and I am telling you be experience - recently I have lost 2 hours to make McAfee virus DB to work on a case sensitive server (cheaper than buying new sql server licence) and is not the only time I have spent to correct commercial DB's for misspellings for CS.

For me case insensitive is for lazy people - hope nobody get's mad on me for this

Jeff Moden SSC Guru Points: 1004591 More actions · Answer 2

I don't think there's any overhead with CI... it's a difference of only 1 bit (32) and only an internal difference of what the mask looks like.

I love CI because I AM lazy... hope nobody gets mad it me for that

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 3

it was always claimed that the binary setting produced the performance gain - not quite the same as CS I think - not sure as I normally create CI servers - however I have a server to build so maybe I'll make it binary and see if I can see a difference. in a case insensitive collation getting the case of a procedure call incorrect still produces cache misses in the proc cache, this was like this in 2000 and is still the same in 2005

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

dahgigman Old Hand Points: 311 More actions · Answer 4

dahgigman

Old Hand

Points: 311

December 11, 2007 at 3:42 am

#758882

Thanks to all for your input on this

ksouak SSC Veteran Points: 252 More actions · Answer 5

I think that case sensitive database make use of binary sort order

that non case sensitive cannot.

It's more related with performance.

rick.foster Old Hand Points: 360 More actions · Answer 6

Hi Andy, Old topic of yours and all be it a small point; MS SQL has since at least SQL 4.2 and maybe farther back I understand that the binary collation is the fastest collation that MS SQL operates under and by definition it is case sensitive. I do not know the white papers on it but at least one large ERP uses it (PeopleSoft). You mentioned how SQL would do a comparison in underlying code by using UCASE. I am thinking not since I would suspect MS did not write separate code lines for each collations. I suspect the lowest reaches most likely are not using a UCASE solution as it would exclude matches for Binary and Case Sensitve collations. FWIW, Rick

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 7

I suspect that most people who claim the performance gain for binary have never actually read it up. The gain is there but only for selected searches/sorts and in most cases in a prod environment you'd not actually be using them. Can't remember where it is/was in msdn/technet but it makes interesting reading.

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

Jeff Moden SSC Guru Points: 1004591 More actions · Answer 8

I've found Zero applications that actually require case sensitivity... If I ever do, I'll change the collation on the columns that do rather than setting the whole database for it.

It's probably also worth mentioning that we have a pot wad of case sensitive Oracle servers at work... everyone I know ends up using WHERE UPPER(colname1) = UPPER(colname2) because of it... that's a little rough on index usage. Any so called gains in performance are, of course, quickly thrown to the ground because of such things.

Oddly enough, the people I know who actually do support case sensitivety are absolutely the worst when it comes to writing readable code... they do everything in lower case or upper case except for, of course, "things in quotes".

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ksouak SSC Veteran Points: 252 More actions · Answer 9

Hi,

changing collation of only some columns get to some trouble

when comparing/joining two columns of different collation.

you need to use an explicit conversion which may lead to bad performance

(not using index due to column convertion)

http://www.sqlservercentral.com/articles/Administering/collate_part1/875/

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 10

yeah but you miss Jeff's point that, and I've seen this too, developers end up casting every character column to upper or lower which does screw things up big time.

Funnily enough the code written by the guys who gave my current client a case sensitive database would probably rate as some of the world's worst. I'm campaigning to get the collation of the server changed!

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

jswong05 Hall of Fame Points: 3553 More actions · Answer 11

In your argument, why do we have English sentence starting with a capital letter? Let us talk about not using that (bad practice) or just lower case everything in your email fl, tx, ma, wa, hello???

Why don't we put A, a on the same ASCII code?

Why don't we all drive on the right-hand side of the street? and use Metric system.:-P

Jason
http://dbace.us

Worst Practices - Making Databases Case Sensitive (Or Anything Else)

Cookies on SQLServerCentral