Worst Practices - Making Databases Case Sensitive (Or Anything Else)

  • You said in your article that you think C# is case sensitive.  For the most part, that's true.  The one place where it isn't true is on the data filter in the DataRowView object.  Even C# recognizes that when you are looking for "a", you are really looking for "a" or "A".  The filter does have a bug when compared to SQL.  "col like '%ab%'" will return the same results in both systems, but "col like '%a%b%'" works in SQL and at least with my version of C#, blows up with an exception.

    Obfuscation code actually takes advantage of that sensitivity to make C# code nearly unreadable by humans. ("aBJ6cx9A", "aBj6cx9A", "aB6Jcx9A" mean three different things in one scope, these same letter sequences can mean totally different things in another scope, and each can also be used both as a variable and several functions in the same scope.)

    I've seen human obfuscated code before.  Either they have a weird sense of humor, they were pressed for time, or they thought they were protecting their job.  Of course, the last didn't work because I was having to read that code.  Worse, I've read code I wrote a year or two earlier and had to scratch my head over what I was thinking.  Since then, I've gotten into the habit of commenting all code as I make it.  (Not come back to the "unneeded" part of coding when I "have time".)

    Personally, I like to see all SQL commands in upper case and user defined tables, fields, functions, etc. in mixed case.  If I worked in a shop that wanted all user defined stuff in upper case, I'd like to see all SQL code in lower case.  The guy who likes development stuff in lower case and production stuff in upper case seems like someone standing on the tracks and waiting for a case sensitive train to run them over.

    I DON'T want case sensitive data.  I like the fact that if I know the table name, I also don't have to remember the case in order to find it in a SQL query.  As I'm typing a command, I like the fact that I can just type the command, not worry if my case is correct.  When I search for something, I want to find it, not jump through hoops to find it.  (Like making sure my select statement doesn't pick up any null values when it's part of a "NOT IN" query.)

    Speaking of null values, if you REALLY want that text string in a field, set it as 'null' not as 'NULL'.  When I see that value in a select statement, it throws me a bit when I count the number of records where that field "IS NULL" and I get 0.

  • I've had the displeasure of working with a case-sensitive database myself, only because it was supporting a legacy application written in (gasp!) COBOL that originally used a case-sensitive flat file data store.  I'm now working for a different company, back to non-case-sensitive databases, and WHEW! what a relief!

  • I come from a Unix background and find case-insensitivity very limiting. I also remember my geography teacher telling me that 'turkey' is not the same as 'Turkey', and my english teacher telling me that 'may' is not the same as 'May'.

    Don't you just set some 'explicit' option, and declare all your variables beforehand to avoid problems? Or is this another case of language bastardisation due to the laziness of the user?

  • I think you need to make a clear distinction between the case sensitivity of data and the case sensitivity of the container it resides in.  I believe having the option to allow case sensitive searches of actual data is a good thing, whereas having case sensitive containers provides no benefit and indeed lead to obsfucation.

    I originally came from a case sensitive programming language environment and untill recently had been using a case insensitive language (VB and VB.NET). Although I've come full circle (my primary language is now C#), I'd have to say a case insensitive language makes more sense.

    Several years ago, myself and 2 others lost several days productivity due to someone turning on case sensivity for tables/columns in SQL Server (not the default setting for good reason!). This was compounded by the fact the column names were in German, so it was not obvious where the 'breaks' in title case occurred. I think it was that same 'HR/financials' package one poster alluded to !!!

  • We were told by our consultant to use case-sensitive SQL databases.  Now we have to constantly remind our users that when they put in "Fl", it's not like "FL".  Thanks for this article - now I know I'm not the only one out there...

  • Brilliant idea - let your users enter data that could (and should) be in a fk table and provided in a picklist! (golf clap) Even more brilliant than allowing a user to type in an invalid option - it sounds like you aren't doing input validation either!

    Not being sarcastic.. okayokay, I am being sarcastic but I haven't heard one good reason arguing in favor of case-insensitivty. Every complaint has been seemingly out of SHEER LAZINESS or inexperienced developers not understanding 1) input validation (something you should be doing anyways, regardless of case-sensitivty) or 2) that your personal standards are to type things in such a way that makes it confusing to remember (eg titlecase, gg).

    Assuming most of you are .net (asp/vb/c# whatever). If your IT department has standards for column naming and variable naming (part of a best practices/standards that your department should already have in place to some degree) you would have a leg up on not worrying about remembering the case of a function/variable/column whatever. Even if you didn't, you are MOST LIKELY USING VISUAL STUDIO, which will of course syntax highlight variables you're trying to use that were declared differently) and point most other error you could make due to mis-casing a word.


    -Ken

  • > I am being sarcastic but I haven't heard one good reason arguing in favor of case-insensitivty. Every complaint has been seemingly out of SHEER LAZINESS....

    That seems like a perspective issue to me.  To you, it's sheer laziness, perhaps.  To someone else, it might be not squandering unnecessary labor.  I think the whole point to this thread is, does case sensitivity buy anything that makes its (undeniable) added costs worthwhile?

    The concept of "no pain, no gain" leads many to suppose that the contrapositive must be true -- "if pain, then gain."  If it's hard to lift, then lifting it will make me stronger.  If it tastes bad, it must be good for you.  If it's hard to code, then I'm a better coder for making the effort.  Like I said, that's a perspective issue -- one I don't share.

  • What language are we storing in the db?  Case matters most to the people writing reports, so if your data is sEnSitive so must your Db.

    One official English Dictionary has these abbreviations:

      mt = mountain,

      Mt = Matthew (huh? guess I've been climbing very large bodies all my life),

      MT = metric ton or Montana or mountain time.

    Technical tidbit:  if any db is case sensitive, the tempdb must be too. Can anyone verify or refute that (for SQL 7.0 and higher) ?

  • Regarding Tom3w's question: "Technical tidbit: if any db is case sensitive, the tempdb must be too. Can anyone verify or refute that (for SQL 7.0 and higher)"

    In SQL 7.0 all databases had to have the same collation. So if you set the collation to one that's case sensitive then tempdb will get created with this setting also.

    Sql2k added the ability to specify collations on a per table (and even per column basis). Since they can now also be created the same way in tempdb, I can refute that for Sql2k


    -Ken

  • I'll have to agree that it's a matter of perspective and preference. My preference would be for object names to be both case sensitive (you have to use the original case to reference the object) and non sensitive (doesn't allow multiple objects with the same name but different case).

    I think it was sushila who mentioned the pain of reading someone else's code where case was seemingly random. Someone else (or several someones) mentioned that the extra effort of using original case needs to be weighed against the benefit. I often have to read / debug / reverse engineer other developers' code and from that perspective the extra effort I'm forced to exert seems like a heavy price compared with being forced to use the original case in the first place. If remembering the original case is a problem, having a standards / convention guide could mitigate it to some degree. And perhaps using a tool that autocompletes object names (I use ApexSQL Edit).

    In terms of data case sensitivity, I don't lean in either direction. The right tools for the right job. Having said that, I usually, if not always, default to case insensitivity.

    I enjoyed the article.

  • For most users 'A' = 'a'. Using case sensitive data can be a source of bugs from the users viewpoint.

    For large systens the maintenace of case sensitive source code is a bit hardly. Just try think about a ten year app coded by two dozen of programmers. Try adjust the database to case sensitive and wait for a disaster.

  • I think most of what could be said has been said. I'm glad we don't exist in a case insensitive world .. !!

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • My 2 cents...  I've DBA'd at least one, maybe two case sensitive databases and got used to them.  It wasn't fun, but I think some of us find the cleaner data and naming conventions that follow to have some refreshing value.

    That said, I also agree with "why do it" now that we have the flexibility to get down to the column level with this issue.

    In programming, I prefer case sensitivity. Period!  To me keeping programming standards in line has always been difficult and case sensitivity helps here to get everyone doing the same "stuff" the same "way".

    I write mostly in C#, Javascript, and SQL.  All 3 are great, but I hate the lack of case sensitivity in SQL.  Good programmers will have excellent SQL code capitalized differently and my mind has to recognize the differences to absorb the meaning.  If you write mostly in one language, and don't have to maintain other folks code this is probably not a problem for you, but for me, I vote for case sensitivity.

    Now spelling is another problem...  Love my spull chicker

  • I realize this is a really old thread, but I read it with some great interest... not one of the folks that claimed that case sensitivity created that mythical 40% increase in speed coughed up any code that would prove their point... I'm just not seeing that type of improvement in SQL Server 2000... dunno... maybe it was that way back in 6.5 or something...

    --===== Create a table for the case sensitivity test

     CREATE TABLE #CaseSensitiveTest

            (

            RowNum INT IDENTITY(1,1) PRIMARY KEY,

            CSString CHAR(2) COLLATE SQL_Latin1_General_CP850_CS_AS,

            CIString CHAR(2) COLLATE SQL_Latin1_General_CP850_CI_AS

            )

    --===== Populate the table with random case sensitive data

     INSERT INTO #CaseSensitiveTest (CSString)

     SELECT TOP 1000000

            CSString = CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*2+65))

                     + CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*2+65))

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    --===== Populate the table with the same but case insensitive data

     UPDATE #CaseSensitiveTest

        SET CIString = CSString

    --======================================================================

    --===== Declare a simple timer variable

    DECLARE @StartTime DATETIME

    --===== Test the performance of the case sensitive column

       DBCC FREEPROCCACHE

       DBCC DROPCLEANBUFFERS

        SET @StartTime = GETDATE()

     SELECT * FROM #CaseSensitiveTest

      WHERE CSString = 'BB'

      PRINT STR(DATEDIFF(ms,@StartTime,GETDATE())) + ' Duration CS'

    --===== Test the performance of the case insensitive column

       DBCC FREEPROCCACHE

       DBCC DROPCLEANBUFFERS

        SET @StartTime = GETDATE()

     SELECT * FROM #CaseSensitiveTest

      WHERE CIString = 'bb'

      PRINT STR(DATEDIFF(ms,@StartTime,GETDATE())) + ' Duration CI'

    DROP TABLE #CaseSensitiveTest

    Sometimes the CS wins, sometime the CI wins... sometimes they tie... I dunno... maybe I'm doing something wrong

    Personally, I have a strong dislike for case sensitivity, BEGIN/END, underscores, and semi-colons, if you catch my drift

    And, I don't care if you want to pay me ONE HUNDRED DOLLARS, one hundred dollars, or even oNe hUnDrEd dOlLaRs so long as the check clears 

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Like most blokes I am naturally insensitive - but that aside. Can anyone confirm - if the database is case insensitive is there any overhead in finding cached sp execution plans if the sp is referenced in a different case then the sp in the database? The answer may be in your article but I missed it. If so what is the performance hit and how does it compare with using/not using owner.sp when the sp is called? Is there the same impact on using the wrong case for tables inside sps? Shouls all tables insode an sp have the owner in front too?

Viewing 15 posts - 61 through 75 (of 85 total)

You must be logged in to reply to this topic. Login to reply