
what is the most efficient way to maintain a denormalized table?
Posted Monday, March 15, 2010 3:38 PM
Forum Newbie


Group: General Forum Members
Last Login: Thursday, May 19, 2011 8:23 AM
Points: 9, Visits: 106
Hi,

We currently maintain a denormalized table that is updated from 3 source tables twice a day. The data in those 3 tables changes a lot, and the denormalized table has over 150 million records. We use an SSIS package to gather the new and changed data from the 3 tables, load it into lookup tables, and then apply the changes to the denormalized table every 12 hours using batch inserts and deletes.

A normal run (fewer than 100,000 updated records) finishes in about 15 minutes, but twice a month we run a different process that increases the volume to about 3 to 4 million records to be denormalized at once, and that run sometimes takes over 15 hours. The denormalized table currently has over 10 indexes because it gets hit a lot by other processes, and the delay of 10 hours or more that happens occasionally is becoming a major problem right now.

I am wondering if there are more efficient ways of maintaining a large denormalized table than what we currently have. I read about using indexed views, but they are not recommended for tables where the data changes frequently. I would appreciate any recommendations.
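For reference, the apply step works roughly along these lines (a simplified sketch; the table and column names here are placeholders, not our real schema):

-- Apply staged changes in batches (all names are illustrative placeholders).
-- The SSIS package has already loaded the new/changed keys into dbo.Stage_Changes.
DECLARE @BatchSize int = 50000;
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    -- Delete the old versions of changed rows in batches to limit log growth.
    DELETE TOP (@BatchSize) d
    FROM dbo.Denorm AS d
    INNER JOIN dbo.Stage_Changes AS s
        ON s.BusinessKey = d.BusinessKey;

    SET @Rows = @@ROWCOUNT;
END;

-- Re-insert the current versions, joining the three source tables.
INSERT INTO dbo.Denorm (BusinessKey, ColA, ColB, ColC)
SELECT s.BusinessKey, t1.ColA, t2.ColB, t3.ColC
FROM dbo.Stage_Changes AS s
INNER JOIN dbo.Table1 AS t1 ON t1.BusinessKey = s.BusinessKey
INNER JOIN dbo.Table2 AS t2 ON t2.BusinessKey = s.BusinessKey
INNER JOIN dbo.Table3 AS t3 ON t3.BusinessKey = s.BusinessKey;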

Thanks,
Sean
Post #883395
Posted Monday, March 15, 2010 3:49 PM
Hall of Fame


Group: General Forum Members
Last Login: Tuesday, October 7, 2014 11:31 PM
Points: 3,840, Visits: 3,847
To improve on your 15-hour process, I think we need to know a bit more about it. Can you give us an explanation of:
1) how you are getting the data from your source tables;
2) how you are staging the data;
3) where and when the transformations to denormalize the data happen;
4) how you are inserting into the destination table.

Is this all stored-procedure driven, or are you using SSIS or some other technology? Where is the bulk of the 15 hours spent? If you are not sure, a rough way to measure is sketched below.
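If you are not already capturing step timings, even something crude will do. A minimal sketch (the log table and procedure name here are invented for illustration, not an assumption about your actual code):

-- Log table and step names are illustrative only.
CREATE TABLE dbo.LoadStepLog (
    StepName  varchar(100) NOT NULL,
    StartTime datetime     NOT NULL,
    EndTime   datetime     NOT NULL
);

DECLARE @Start datetime = GETDATE();

EXEC dbo.ApplyDenormChanges;  -- substitute whichever step you want to time

INSERT INTO dbo.LoadStepLog (StepName, StartTime, EndTime)
VALUES ('ApplyDenormChanges', @Start, GETDATE());

Wrap each stage of the load that way and the slow step usually jumps right out.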




John Rowan

======================================================
Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
Post #883405
Posted Monday, March 15, 2010 9:23 PM


SSC-Dedicated


Group: General Forum Members
Last Login: Today @ 1:17 PM
Points: 35,350, Visits: 31,890
15 Hours? It would appear that, in your case, denormalization for reporting purposes is the wrong thing to do. I'd consider a more normalized approach that uses performance tools such as pre-aggregation in sub-queries, well-formed indexes, and a healthy dose of best practices for performance.
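To illustrate what I mean by pre-aggregation in a sub-query (the table and column names here are invented for the example): aggregate the big detail table down first, then join the small result set to the parent table, rather than joining first and aggregating the multiplied rows.

-- Pre-aggregate the large detail table in a derived table first,
-- then join the small result (names invented for illustration).
SELECT c.CustomerID,
       c.CustomerName,
       o.OrderCount,
       o.TotalAmount
FROM dbo.Customer AS c
INNER JOIN (
    SELECT CustomerID,
           COUNT(*)    AS OrderCount,
           SUM(Amount) AS TotalAmount
    FROM dbo.OrderDetail
    GROUP BY CustomerID
) AS o
    ON o.CustomerID = c.CustomerID;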

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #883511