|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Thursday, May 19, 2011 8:23 AM
Points: 9,
Visits: 106
|
|
Hi,
We currently maintain a denormalized table which gets updated from 3 tables twice a day. The data in the 3 tables changes a lot and the denormalized table has over 150 Million records. We currently use an SSIS package to gather new data + changed data from the 3 tables and load them to a look up tables and then apply the changes to the denormalized table every 12 hours using batch inserts and deletes. The process runs in about 15 min in a normal run (less than 100000 updated records), but twice a month we run a different process which increase the data volume to about 3 to 4 million records to be denormalized at once, which sometimes takes over 15 hours. The denormalized table currently has over 10 indexes because it gets hit a lot by other processes, and the 10 hour delay that happens occasionally is becoming a major problem right now. I am wondering if there are more efficient ways of maintaining a large denormalized table other than what we currently have. I read about using indexed views, but they are not recommended for tables where data frequently changes. I would appreciate any recommendation.
Thanks, Sean
|
|
|
|
|
Hall of Fame
       
Group: General Forum Members
Last Login: Tuesday, January 29, 2013 10:54 AM
Points: 3,837,
Visits: 3,821
|
|
To improve on your 15 hour process, I think we need to know a bit more about it. Can you give us an explanation of how you are 1) getting the data from your source tables; 2) staging the data; 3) tranformations to denormaize the data - when/where is that happening; 4) how are you inserting into destination table.
Is this all SP driven or are you using SSIS or some other technology? Where is the bulk of the 15 hours spent?
John Rowan
====================================================== ====================================================== Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Yesterday @ 9:57 PM
Points: 32,906,
Visits: 26,790
|
|
15 Hours? It would appear that, in your case, denormalization for reporting puposes is the wrong thing to do. I'd consider a more normalized approach that uses performance tools such as pre-aggregation in sub-queries, well formed indexes, and a healthy dose of best practices for performance.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|