Loading a 24x7 Data Warehouse

  • Leo Peysakhovich

    Hall of Fame

    Points: 3880

    Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/lPeysakhovich/2901.asp

  • Deb Chat

    SSC Rookie

    Points: 33

    Undoubtedly one of the finest pieces of work. Great work, Leo. - Debasish

  • Ian Yates

    SSCoach

    Points: 19738

    When you gave the problem description, your solution was what was in my head - good to know that I at least thought of a real-world solution!

  • pshotts

    Right there with Babe

    Points: 784

    I have an idea that with SQL Server 2005 you could accomplish something like this using a database snapshot.

    Has anybody actually tried that?
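    For reference, the SQL Server 2005 syntax in question looks roughly like this; the database and file names below are made up, and a snapshot needs one ON entry per data file of the source database:

        -- Create a point-in-time, read-only snapshot of the warehouse (SQL 2005+).
        -- 'DW' and its logical data file 'DW_Data' are illustrative names.
        CREATE DATABASE DW_Snapshot
        ON ( NAME = DW_Data, FILENAME = 'D:\Snapshots\DW_Data.ss' )
        AS SNAPSHOT OF DW;

        -- Readers can then query the snapshot while the base database is being loaded.
        SELECT COUNT(*) FROM DW_Snapshot.dbo.SomeTable;   -- SomeTable is hypothetical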

  • VincentRainardi

    SSCrazy

    Points: 2905

    Thank you for sharing your experience, Leo. It's a good and useful article. We use a data warehouse where the data from stage is loaded alternately into two identical sets of star-schema tables (the only difference is the suffix at the end of the table names), and a set of views selects from whichever set was loaded last. The downtime is minimal: just the time needed to drop and recreate the views so they point at the just-loaded set of tables. We have been using this for 6 years. Regards, Vincent
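    A minimal sketch of that view switch, assuming two table sets with _A and _B suffixes (the names are illustrative, not Vincent's actual schema):

        -- Repoint the reporting view at the set of tables that has just been loaded.
        IF OBJECT_ID('dbo.FactSales', 'V') IS NOT NULL
            DROP VIEW dbo.FactSales;
        GO
        CREATE VIEW dbo.FactSales
        AS
            SELECT * FROM dbo.FactSales_B;   -- previously pointed at dbo.FactSales_A
        GO
        -- Repeat for each fact and dimension view in the schema.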

  • Sorin Petcu

    Hall of Fame

    Points: 3145

    Good job, Leo!

    Finally, I found a stored procedure that kills the DB connections, KILL_DB_CONNECTIONS. And, not least, your logical schema matches one I have used in production at one of my customers. Thank you for confirming that what I had in mind was in good shape.
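    The procedure itself isn't posted in this thread, but a kill-connections routine of this kind usually looks something like the sketch below (the procedure and parameter names are my own, not the actual KILL_DB_CONNECTIONS):

        -- Kill every session connected to a given database (SQL 2000/2005 era,
        -- using master.dbo.sysprocesses).
        CREATE PROCEDURE dbo.usp_KillDbConnections
            @DbName sysname
        AS
        BEGIN
            DECLARE @spid int, @sql nvarchar(50);

            DECLARE spids CURSOR LOCAL FAST_FORWARD FOR
                SELECT spid
                FROM   master.dbo.sysprocesses
                WHERE  dbid = DB_ID(@DbName)
                  AND  spid <> @@SPID;           -- never kill our own session

            OPEN spids;
            FETCH NEXT FROM spids INTO @spid;
            WHILE @@FETCH_STATUS = 0
            BEGIN
                SET @sql = N'KILL ' + CAST(@spid AS nvarchar(10));
                EXEC (@sql);                     -- KILL does not accept a variable directly
                FETCH NEXT FROM spids INTO @spid;
            END
            CLOSE spids;
            DEALLOCATE spids;
        END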

    In Theory, theory and practice are the same... In practice, they are not.

  • EdVassie

    SSC Guru

    Points: 60274

    What is the size of your DW, and how long does the backup and the restore take?  I would be concerned that as your DW size grows, you may run out of time to do the backup/restore cycle.

    What are your thoughts on using a SQL 2005 database snapshot in place of the backup/restore?

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • Leo Peysakhovich

    Hall of Fame

    Points: 3880

    The size of the database is 5-10 GB. But remember that there is the whole day to back up and restore, because after the rename happens the LOAD database is needed only for the next load, 24 hours later.
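    For readers who have not seen the article, the swap itself comes down to a couple of rename statements; a rough sketch (the database names are illustrative, and the article's exact steps may differ):

        -- Drop any remaining connections and take exclusive access to the live copy.
        ALTER DATABASE DW SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

        -- Swap the names: the old live copy steps aside and the freshly
        -- loaded copy goes live. (Use sp_renamedb on SQL Server 2000.)
        ALTER DATABASE DW MODIFY NAME = DW_OLD;
        ALTER DATABASE DW_LOAD MODIFY NAME = DW;

        -- Reopen the retired copy so it can serve as the next load target.
        ALTER DATABASE DW_OLD SET MULTI_USER;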

  • Hans Lindgren

    SSChampion

    Points: 10494

    I was wondering what your customers think when you kill their connections. I don't think that can be an acceptable solution to a 24/7 warehousing problem in all cases...

  • Sorin Petcu

    Hall of Fame

    Points: 3145

    In fact, I am killing cached database connections made by third-party applications other than the master application. So it is OK, and my customer is happy, because the SQL Server instance isn't filled up with useless connections.
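    One way to see what those sessions are before killing them (the database name is a placeholder; sysprocesses is the SQL 2000/2005-era system view):

        -- List idle sessions in the target database, with the application that owns them.
        SELECT spid, loginame, hostname, program_name, status, last_batch
        FROM   master.dbo.sysprocesses
        WHERE  dbid = DB_ID('DW')          -- 'DW' is an illustrative database name
          AND  status = 'sleeping'         -- idle sessions holding pooled connections
        ORDER BY program_name, last_batch;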

    In Theory, theory and practice are the same... In practice, they are not.

  • Erhan Hosca

    Grasshopper

    Points: 24

    How do you know/verify that the connection(s) you are killing are "cached"?

  • Leo Peysakhovich

    Hall of Fame

    Points: 3880

    Customers do not see it because connections come from the connection pool, and the pool re-establishes them right afterwards. In the worst-case scenario a customer must hit the refresh button. As I pointed out in the article, traffic is very low at night, so for a customer to see a connection being dropped, their request and the KILL command must arrive at exactly the same time; only then is a refresh needed. All other cases are taken care of by the connection pool.

  • Hans Lindgren

    SSChampion

    Points: 10494

    I guess that is good if you can be sure that just a refresh would do it.

    I've seen quite a few DWHs where query runtimes get into the hour range (most of the time because of ad-hoc queries and under-educated users). Those users schedule their queries somehow (for example with WAITFOR TIME/DELAY) and are pretty angry when they get back in the morning and see that their query was killed. But then again, they would know about and expect this behaviour with a well-documented DB switching procedure.
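    The pattern being described is simply a session parked until a fixed time, for example (the procedure name is hypothetical):

        WAITFOR TIME '02:00';             -- or WAITFOR DELAY '04:00:00' for a relative wait
        EXEC dbo.usp_LongRunningReport;   -- the long-running ad-hoc work starts here
        -- A session sitting in WAITFOR like this is exactly the kind of connection
        -- that a nightly kill-connections step would terminate.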

    Good article Leo!

    Regards, Hanslindgren

  • noeld

    SSC Guru

    Points: 96590

    A 10GB data warehouse is small enough to follow the proposed procedure.

    When your data warehouse reaches 200 to 500GB you will get into a TOTALLY different ball game.

    Cheers,


    * Noel

  • Leo Peysakhovich

    Hall of Fame

    Points: 3880

    There is no way for our customers to run their own queries. There are only scheduled jobs and applications, and they use stored procs only.
