Is There Any Reason I Should NOT enable READ_COMMITTED_SNAPSHOT?

  • I am trying to convince my managers to allow me to enable READ_COMMITTED_SNAPSHOT on the databases for our production data warehouse. I have two questions for you, my esteemed colleagues:

    1. Do you know of any good articles that describe the benefits of READ_COMMITTED_SNAPSHOT to a relatively non-technical audience? (It might be more accurate to say that they are not SQL Server experts. They do have technical expertise in their own areas.)

    2. What are the disadvantages of enabling READ_COMMITTED_SNAPSHOT? Does it expose me to errors that I would not face if I did not enable it? How large is the risk (in very general and relative terms)?

    Thank you, very much, for your input! 🙂

  • Disadvantage: More load (size and activity) on TempDB.



    Ole Kristian Velstadbråten Bangås - Virinco - Facebook - Twitter

    Concatenating Row Values in Transact-SQL[/url]

  • Disadvantages: if your application was not designed for optimistic locking, RCS could break some of your business logic or lead to incorrect results.

    Let me reverse your question for just one moment: you say this is a DataWarehouse, but RCS is usually a good fit for heavy loaded OLTP databases. What are the advantages that you are expecting from RCS?

    -- Gianluca Sartori

  • Sigh. :pinch: Where shall I begin? We have concurrency issues. (The root cause of these concurrency issues is poor design, but that - alas - is water under the bridge.)

    The previous solution to these concurrency issues was to use the WITH(NOLOCK) query hint in every single FROM and JOIN clause. This development approach was actually mandated by senior management. (Believe me: I wish I were joking.)

    At the time, I was not around to collapse in a fit of laughter. Since I have taken a solemn oath to not bathe until I have sought out and destroyed every single instance of this abomination, I wanted to provide our developers with an alternative. We still have concurrency issues, you see. :pinch:

  • NOLOCK? :sick:

    With RCS enabled writers don't block readers. Is this the kind of concurrency issue you're trying to address?

    As far as the dangers is concerned, you could read this: http://www.jimmcleod.net/blog/index.php/2009/08/27/the-potential-dangers-of-the-read-committed-snapshot-isolation-level/

    -- Gianluca Sartori

  • Gianluca Sartori (12/28/2011)


    NOLOCK? :sick:

    With RCS enabled writers don't block readers. Is this the kind of concurrency issue you're trying to address?

    As far as the dangers is concerned, you could read this: http://www.jimmcleod.net/blog/index.php/2009/08/27/the-potential-dangers-of-the-read-committed-snapshot-isolation-level/

    Yes, that is precisely the sort of concurrency issue we have. However, maybe I am better off simply doing a search and replace on all occurrences of WITH(NOLOCK). I'll just have to hunt down the individual performance problems that result. :pinch:

    I'll review the article. Thank you for the suggestion. 🙁

  • I am sorry but I am not comfortable with what you said. If it’s DWH database, it would have its own window for ETL tasks. Except for that period, the database won’t have any write activity. Why do you want to implement READ_COMMITTED_SNAPSHOT?

  • I can't say I'd use RCS on a warehouse under any circumstance I can think of.

    Normally, a warehouse has a load interval, and TablockX is usually more appropriate during that window. After the load window, you don't have anything except read locks, which allow concurrency.

    RCS is really only applicable to OLTP systems, not OLAP systems, in my experience.

    Even if someone had the bright idea of doing a "realtime data warehouse" with constant read/write concurrency, snapshot isolation of either flavor is likely to result in inconsistent query results as rows are versioned and unversioned, and locking would be better in that situation.

    Of course, RCS on such a warehouse instead of NoLock, probably (almost certainly) is an improvement. But it's a band aid on a bleeding artery.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • In a warehouse, TBH Read Uncommitted or NO LOCK is not as bad as everyone says. OK, I have said it. You can all start cussing me... 🙂

    -Roy

  • Well, as I research the issue further, I am beginning to agree with all of you. RCSI is not going to help, and it just might make things worse.

    I'll say that this is not your typical data warehouse. The original developers picked up on the idea that data warehouse structure is different than an OLTP database structure. However, they interpreted this to mean that no rules applied at all.

    So, now I have things like massive tables with non-atomic fields or Entity-Value-Attribute structures. Every time I point out the problems such things can cause, someone says, "But this isn't an OLTP database." :pinch:

  • David Moutray (12/28/2011)


    ...

    So, now I have things like massive tables with non-atomic fields or Entity-Value-Attribute structures. Every time I point out the problems such things can cause, someone says, "But this isn't an OLTP database." :pinch:

    Let me point you to another David...

    http://www.sqlservercentral.com/Forums/Topic1225577-391-1.aspx

  • I don't see how RCS would be a problem in a data warehouse\datamart application.

    If there are no inserts/updates/deletes happening, then there are no row versions being generated in the version store in tempdb, so there would be no additional load there. Select queries would not create locks, except for a schema stability lock.

    If the data warehouse is being updated while queries are being run, then single select statements would be able to run with consistent results without being blocked. For multiple statement queries, then SNAPSHOT isolation could be used to maintain consistency without being blocked. In either case, the consistency of the results would be far better than using NOLOCK, which is the current situation.

  • Dev (12/28/2011)Let me point you to another David...

    http://www.sqlservercentral.com/Forums/Topic1225577-391-1.aspx

    David Portas makes a good point in his post. (I was following that thread until you guys started talking about drawers. The analogies became too thick for me to breathe properly. :-))

    I think I might enjoy working with a normalized data warehouse. I think Star-Schema is also a valid model, in that it is optimized for reporting.

    The data warehouse with which I currently work does not follow any coherent model. Therein lies my problem. :pinch:

  • Roy Ernest (12/28/2011)


    In a warehouse, TBH Read Uncommitted or NO LOCK is not as bad as everyone says. OK, I have said it. You can all start cussing me... 🙂

    No, I agree with you. In a database that's loaded purely by a scheduled ETL process (or more than one), there's no reason to worry about dirty reads, except during the load process. Lock the database (single-user) during that time period, and you can set all the queries in it to Read Uncommitted isolation level, and you won't hurt anything.

    Generally speaking, you won't help anything either, because read locks can share resources, so all you're really doing by allowing NoLock/RU is making that explicit instead of implicit.

    Another option, Enterpise Edition only, is to have OLAP databases use actual snapshots for querying. Then you don't have to lock them during loads, since the queries will be reading from the snapshot anyway. The snapshots can grow pretty big as a lot of data changes, but that's often not a problem if you re-generate the snapshot before each ETL load. Tricky to do it right, since you have to manage connections so they go to the most recent snapshot dynamically, but works well if you get it working right.

    Any of these are just ways to avoid blocking issues in OLAP. But snapshot isolation shouldn't really be needed in OLAP, since snapshot isolation is about allowing optimistic read locking to avoid conflicts with concurrent, frequent updates. OLAP databases generally don't have a lot of update concurrency with their reads. That's the whole point of OLAP, actually.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • David Moutray (12/28/2011)


    Well, as I research the issue further, I am beginning to agree with all of you. RCSI is not going to help, and it just might make things worse.

    I'll say that this is not your typical data warehouse. The original developers picked up on the idea that data warehouse structure is different than an OLTP database structure. However, they interpreted this to mean that no rules applied at all.

    So, now I have things like massive tables with non-atomic fields or Entity-Value-Attribute structures. Every time I point out the problems such things can cause, someone says, "But this isn't an OLTP database." :pinch:

    What they really mean is, "this isn't a database that we know what we're talking about, so shut up and leave us in our ignorant bliss!"

    Honestly, EAV tables are about making it possible for lazy developers to build OO mapping into RDB structures, without regard to data integrity, performance, et al.

    I recently had to fix an EAV structure in an OLAP database. The data is loaded a few times a day from an OLTP source, and the original devs put it all in EAV tables. The web pages it was feeding "worked fine in our test environment and for our other customers, where we only have six or seven rows of data to deal with". In our case, it was taking web pages up to a minute to load. Turned the data into a proper relational structure, and the query time went down from most of a minute (average) to 3 milliseconds average.

    The devs decided I had done some sort of black magic on it that was WAY out of the normal bounds of what any normal person could possibly know. Not because I had done so, but because deciding that meant they didn't have to take the time and effort to learn why it worked that way. (Third party vendor devs, not our in-house guys. The in-house guys are all much smarter than that. I work in a really good team.)

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 15 posts - 1 through 15 (of 70 total)

You must be logged in to reply to this topic. Login to reply