﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / Article Discussions / Article Discussions by Author / Discuss content posted by Adam Aspin  / A faster way to prepare dimensional databases / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Thu, 23 May 2013 09:31:35 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Thanks Dave for your suggestions.</description><pubDate>Wed, 22 Dec 2010 01:20:32 GMT</pubDate><dc:creator>itdevnews</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Nice article.Thanks for taking the time to put it together and share with us.</description><pubDate>Tue, 21 Dec 2010 12:33:19 GMT</pubDate><dc:creator>JasonRowland</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>I done the MCITP which was quite good but I did find at the time there was a bit of a gap between what the book/course teaches you and what was in the exam. It's worth doing, but you'll need some pratical experience as well.</description><pubDate>Mon, 20 Dec 2010 03:10:19 GMT</pubDate><dc:creator>dave-dj</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Thank you for this great article. I am migrating my skills from relational database development to dimensional modeling. The transition is slightly non straightforward since I have been working with OLTP databases for years now. One book I have found really helpful and would like to share is "Pro SQL Server.2008 Analysis Services".Does anyone have any recommendations as to:1) Any IT certifications exams worth sitting for. Is 70.448 worth the time and effort?.2) Any English language/distance learning thought MSc degrees. SQL Server based rather than Oracle. 3) Seeking helpful resources for beginners/intermediate level.  </description><pubDate>Sun, 19 Dec 2010 06:26:05 GMT</pubDate><dc:creator>itdevnews</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Interesting article,althoug I read it only after it was reposted.In these 'upserts', we usually rely on the MERGE statement.  As it is written in hand-tuned Assembler, its speed is IMO far superior to any custom-written T-SQL code.</description><pubDate>Fri, 17 Dec 2010 19:37:24 GMT</pubDate><dc:creator>Revenant</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Nice article, thanks!We have found the same thing in our process, that it is faster to completely rebuild with an INSERT INTO a blank table, than it is to try to do incremental updates. (UPDATE just takes too long with the number of changes we have, even if you batch it.)The one thing that worries me in your example is that you assign the surrogate key based on the description, i.e. ClientName, what happens if you have two clients with the same name? Why don't you build the surrogate keys based on what I assume is the business key, i.e. ClientID? (It is less likely to be an issue with the Geography or Products dimensions, but removing the multi-column joins would probably speed things up as well.)</description><pubDate>Fri, 17 Dec 2010 15:59:05 GMT</pubDate><dc:creator>UMG Developer</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>A good article - that in severals ways reminds my of a process at my company for building the cube - but NOT populating our datamart ("normalized" and dimensional databases). So the cube is totally rebulit on a nightly basis - but more and more data in the underlying datamart makes this process longer and longer, and we are in the process of re-designing this for a delta-load method (changes only).In my opinion your process has a few drawbacks - that can turn out to make this kind of ETL-solution a no go!First of all - and mentioned by others in this thread - your ability to create good execution plans.If you always drop and recreate your tables you stand to loose all your SQL Server statistics for those table - and hence data for creating good executions plans, correct indexing, parallelism and more.Unless you know exactly how the users and/or applications consume the data, ie. know all queries and their variations and have tested them, you will maybe gain good performance in loading data, but most all reads on data afterwards will - due to lack of statistics and good executions plans - not perform well. Of cource if no users has direct access to the datamart and only the cube, AND you have optimized all your processes using previous executions plans - and data has not changed structure or keys - then this is not a real problem.In my company we perform quite a bit of "ad hoc" queries on data when tracking problem, analyzing, modelling and more - so having no statistics will hurt us BAD! WE have recognized that we do NOT - and cannot anticipate - how all the users contruct their queries and make "crazy" but repeatable selections. So we need our statistics to much to do "DROP TABLE - SELECT INTO".A much better process - but in overall theory equally well performing process - would be to use "BULK LOGGED on the database as a logging mechanism - TRUNCATE TABLE - and BULK INSERT as a data maintaining mechanism" (actually you should do a "UPDATE STATS" after the loading). Here you get the good performance in the loading process - and don't loose your statistics - so you can build (REBUILD) the correct indexes, and have up-to-date executions plans making the overall access of data perform faster.Another drawback of the "SELECT INTO" is that it locks your system tables for the duration of the insert, thus it can escalate to serious locking issues in your loading process. So parallel "SELECT INTO" processes can result in deadlocks or timeouts - you can only use a sequential load of tables to be "safe".Also - if your have a non-optimal I/O subsystem, then having to write all those data over and over each night will surely at some point give you a performance bottleneck. Only delta-loads (not ful loads) of data can help you here - and using hash-values (MD5 or the like) for each row, checking that data is updated on a row basis, together with MERGE processes to do INSERT, UPDATE, DELETE and building slowly changing dimensions, will work.The hash-method will of course create some more activity on your cpu's - and you have to read from disk to compare, but this should be not any serious source of performance problems. If so then any read process will be slow a the disk subsystem should be changed or upgraded.Even though I like the simplicity of your solution - it has - in my opinion too many unknown or un-estimated performance issues connected to it. So I would not ever use this process on a whole database (I do use a similar process on a few distinct tables, due to being "lazy" in the design process).Regards,Claus T. Madsen</description><pubDate>Fri, 17 Dec 2010 13:10:18 GMT</pubDate><dc:creator>clausm73</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Pretty good article. Few things to remember about building DWs, one of which it seems you've already learned:1. Avoid updates where possible. Inserts are many times faster. (which leads to point 2)2. Build Dimensions then Fact tables. (Avoid updates)3. Building SCDs relieves the need to update records. Also, it makes a better historical view. I try to make all my dimensions SCDs.4. MERGE works well for dimensions for which history is neither required or desired. [url=http://technet.microsoft.com/en-us/library/cc879317.aspx]You can get some performance gains if you spend time tweaking your MERGE.[/url]5. Avoid deleting info (or truncating). There are some cases where this is unavoidable, but try SCD or MERGE instead where possible. (unless this is causing the dim to grow out of control - like millions of records)Also, correct me if I'm wrong, if you use SELECT INTO doesn't that make it hard for the optimizer to generate a good plan? Because it doesn't know what the table will look like before hand. That's what I was told a while back by a performance guru. If this is true, probably best to keep all your perm and stage tables around between ETL sessions. *EDIT* I think I remember him saying best to avoid this because the sproc must be recompiled every time it's run.</description><pubDate>Fri, 17 Dec 2010 09:46:09 GMT</pubDate><dc:creator>amenjonathan</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Thanks for the details in the article, I am gald I read through the whole article and comments The comments helped the discussion topic and paths to loading a Data Mart/Warehouse.I am sure hopeful those with other options will write articles based on their experience.Steve will publish them if they are as well written as this one.Thomas LeBlancTheSMilingDBA</description><pubDate>Fri, 17 Dec 2010 08:59:34 GMT</pubDate><dc:creator>Thomas LeBlanc</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>[quote][b]dave-dj (12/17/2010)[/b][hr][quote]For example, my understanding is BIDS Slowly Changing Dimension transformation is a performance hound for large tables.[/quote]FAO tskelly:If your finding the standard SCD component of SQL slow, have a look at Todd McDermitts Kimball SCD component.I've switched to using his Kimball SCD for a number of reasons. First of all it's faster.  As mentioned in the details, it doesn't destroy the dataflow when you need to make changes.  Also, it provide a wealth of auditing information.  Well worth a look!You can find it here: http://kimballscd.codeplex.com/[/quote]We use Todds component as well, it has a few performance issues with large datasets, we did some alpha testing the v1.6 of the component and that seems to have fixed some of the perfomance issues, unfortuantely we're just waiting for the final release.</description><pubDate>Fri, 17 Dec 2010 03:57:55 GMT</pubDate><dc:creator>Jason-299789</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Thank you Adam.At first it is almost necessary to have control of the the whole "ETL" process. Missing data orbad quality is major problem in many cases. You can seldom avoid outer joins. I use ISNULL(expression,'MISSING') and have a default dimension member MISSING in most of the dimensions. Your approach is also usefull in "non OLAP" situations like with the new feature PowerPivot. We do have dimension tables and one or more fact tables to import. PowerPivot then creates the "CUBE" by it self.Gosta M</description><pubDate>Fri, 17 Dec 2010 03:28:14 GMT</pubDate><dc:creator>Gosta Munktell</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>[quote]For example, my understanding is BIDS Slowly Changing Dimension transformation is a performance hound for large tables.[/quote]FAO tskelly:If your finding the standard SCD component of SQL slow, have a look at Todd McDermitts Kimball SCD component.I've switched to using his Kimball SCD for a number of reasons. First of all it's faster.  As mentioned in the details, it doesn't destroy the dataflow when you need to make changes.  Also, it provide a wealth of auditing information.  Well worth a look!You can find it here: http://kimballscd.codeplex.com/</description><pubDate>Fri, 17 Dec 2010 03:25:12 GMT</pubDate><dc:creator>dave-dj</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>[code="sql"]INSERT INTO &amp;lt;table&amp;gt; WITH (TABLOCK)[/code]will minimally log in the same way [code="sql"]SELECT .... INTO[/code] will by default.</description><pubDate>Fri, 17 Dec 2010 02:33:10 GMT</pubDate><dc:creator>hallidayd</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Its a good article.However, did you experience any problems with datatypes being incorrectly assigned using the SELECT..INTO.. method, I notice you mention that nulls are handled in the consolidation layer so it may mitigate the issue.I'm also not sure how this can be adapted to an incremental DW unless you are not concerned about maintaining the history, as by dropping the fact table you loose the ability to maintain the historic Fact record, and thus you surely defeat the object one of the objectives of an incremental, which is to allow you to perform point in time analysis, which can be quite critical for DW's in the Finance industry.</description><pubDate>Fri, 17 Dec 2010 01:46:22 GMT</pubDate><dc:creator>Jason-299789</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Thanks for the article.</description><pubDate>Fri, 17 Dec 2010 00:08:18 GMT</pubDate><dc:creator>SQLRNNR</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Fabulous! I look forward to reading it! Adam</description><pubDate>Fri, 26 Feb 2010 01:51:21 GMT</pubDate><dc:creator>Adam Aspin</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>We don't need no stinkin' changing dimension widgets...  Took me a while with the article-submitting editor, but I got my treatise on using MD5 submitted.  We'll see how fast it gets rejected.</description><pubDate>Thu, 25 Feb 2010 23:02:37 GMT</pubDate><dc:creator>magarity kerns</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>[quote][b]magarity kerns (2/25/2010)[/b][hr]No, I've never had a problem with using md5.  I'm inspired to write my own article on how to use md5 in an update/insert strategy for loading data warehouses if I can get Steve the admin to accept it.[/quote]By all means, please write away.  We currently truncate and replace, but would be extremely interested in a more incremental strategy.  For example, my understanding is BIDS Slowly Changing Dimension transformation is a performance hound for large tables.</description><pubDate>Thu, 25 Feb 2010 12:48:21 GMT</pubDate><dc:creator>tskelley</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>No, I've never had a problem with using md5.  I'm inspired to write my own article on how to use md5 in an update/insert strategy for loading data warehouses if I can get Steve the admin to accept it.</description><pubDate>Thu, 25 Feb 2010 12:16:32 GMT</pubDate><dc:creator>magarity kerns</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Ok, great!Sorry if I misunderstood.</description><pubDate>Thu, 25 Feb 2010 02:58:15 GMT</pubDate><dc:creator>amritpal.parmar</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Hi Amrit,This is explained in my reply to Magnus, above - the design is in a considerable state of flux, and we wnated complete teardown on every process run, to guarantee coherence and to avoid having "old" data cluttering up the DW.Sorry that this wasn't clear in the article.Otherwise yes - standard techniques could (and probably will) be used.Regards,Adam</description><pubDate>Thu, 25 Feb 2010 02:57:04 GMT</pubDate><dc:creator>Adam Aspin</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Hi,thanks for the article and example.what I don't understand and this may be as you don't have complete control of the ETL or you're restricted to on the design, is why do you truncate the Dimensions? could you not just relate this to a Business Key and do appropriate SCD 1 / SCD 2 transformations?Even with the Fact table, you could just insert for new records and update for existing records.Regards,Amrit</description><pubDate>Thu, 25 Feb 2010 02:24:59 GMT</pubDate><dc:creator>amritpal.parmar</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>For Dbowlin:There is a good MD5 checksum approach here:http://www.tek-tips.com/viewthread.cfm?qid=1268144&amp;page=1</description><pubDate>Thu, 25 Feb 2010 02:03:36 GMT</pubDate><dc:creator>Adam Aspin</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Hi Magarity,Absolutely the way to go once the process is stable and populated on an incremental basis.Tell me, I gather from the discussions on the subject that there is only a 1 in 2^40 chance of a duplicate MD5 checksum (as opposed to the SQL Server CHECKSUM function) - have you ever experienced problems with checksums?Thanks,Adam</description><pubDate>Thu, 25 Feb 2010 01:59:51 GMT</pubDate><dc:creator>Adam Aspin</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Hi Centexbi,This sounds really interesting - I will give it a try as soon as I can.Thanks for the idea!Adam</description><pubDate>Thu, 25 Feb 2010 01:55:44 GMT</pubDate><dc:creator>Adam Aspin</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Hi There Casinc835,Thanks for the input!After 60 -odd days and a 10% increase in volumes (and admittedly a lot of design changes) we are not seeing any notable increases in processing times, so it seems pretty linear.Regards,Adam</description><pubDate>Thu, 25 Feb 2010 01:54:12 GMT</pubDate><dc:creator>Adam Aspin</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Hi Magnus,Of course, you are absolutely right that your approach is faster, but, as you point out, it reauires complete control of the entire process. Also (which I did not make clear enough in the article, so please excuse me), the appraoch that I am suggesting is very much a development technique. Our data warehouse is evolving constantly, and dimensions and attributes are being altered all the time (this is one of those projects where "Agile" means less design and planning than is healthy...) so the design is not stable enough to persist dimesnion attributes - yet.Once we reach a more stable setup, then we will certainly adapt the process to persist unchanging dimension and fact data, and adopt the most appropriate techniques which suit the project.Thanks for the very succinct and clear description of how to go about doing this!Adam</description><pubDate>Thu, 25 Feb 2010 01:51:16 GMT</pubDate><dc:creator>Adam Aspin</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Hi Joel,Thanks for this idea - but can you confirm that processing large data sets in batches like this is faster overall for the entire dataset? My (admittedly hazy) memory of this approach is that the overall time taken is around the same?THanks for the input!Adam</description><pubDate>Thu, 25 Feb 2010 01:44:52 GMT</pubDate><dc:creator>Adam Aspin</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Where can I find out more about the MD5 checksum?</description><pubDate>Wed, 24 Feb 2010 16:29:06 GMT</pubDate><dc:creator>Daniel Bowlin</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>One technique that can be used is to keep all the surrogate keys in a separate fact extension table with the fact id. Hence you are either updating a very narrow fact table or inserting to it. We always use views for dimensions and fact tables as input to an SSAS cube. That way, you can join the the two fact tables in a view to get the surrogate keys deployed as well.</description><pubDate>Wed, 24 Feb 2010 15:44:47 GMT</pubDate><dc:creator>centexbi</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Very impressive article!I'm just now being asked to transition to some DW work after eons of doing primarily OLTP support.I'm bookmarking this article!Thanks so much for your efforts.</description><pubDate>Wed, 24 Feb 2010 13:59:25 GMT</pubDate><dc:creator>SQLBOT</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>The final refinement of the SK lookup table method comes when you add an MD5 checksum column and leave them permanently populated.  Use these tables to quickly determine update/insert for only changed or new records.  You can even make one of these for the fact table itself.  Then fast incremental loads for the entire process become possible and your times will be cut dramatically.  I've dealt with source systems that put out flat file dumps of 500+ million rows daily and used this method to process the deltas in barely an hour.</description><pubDate>Wed, 24 Feb 2010 12:51:56 GMT</pubDate><dc:creator>magarity kerns</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>"Insert Into" with a large amount of records will run a long time because SQL Server does the whole insert in one transaction. Use a data pump (Data Flow Task) and set the properties to write the transaction every 5000 records (set both "Rows per Batch" and "Maximum insert commit size" to 5000). You will save hours on your loads. You don't really care about backing out the whole transaction because if it fails, you will most likely go back to the truncate and start over again.</description><pubDate>Wed, 24 Feb 2010 10:25:55 GMT</pubDate><dc:creator>Joel Srebnick</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Adam,Great article!  Very succinct and gets to the point about how important a DW and DM coexist together.  My question is about growth.  How is the process working after say 30 days and 180 days?  What has the increased volume done to performance?  Once again, great Job.Jim</description><pubDate>Wed, 24 Feb 2010 08:28:17 GMT</pubDate><dc:creator>casinc815</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Very good article.  I am working in a very similar situation.  Eventually I will gain more control over the ETL process.  But I took a couple of good ideas from this article that will definitely help me in the short term.  Thanks.</description><pubDate>Wed, 24 Feb 2010 08:06:02 GMT</pubDate><dc:creator>Daniel Bowlin</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>This is one of the best nuts and bolts articles I've come across in SSAS.Question for the Group:Is there a "test" dataset available for creating an SSAS Dimensional Database?  All of the books, articles and commentary tell you to load the Adventure Works Cube and move on...this is a bit like putting the cart before the horse...it is already a completed solution.The dataset ideally would have semi-dirty data that requires some cleanup before it can be loaded into a dimensional DB...something that is commonplace everywhere except in AdventureWorks.Please correct me if I am off the mark on AdventureWorks DW.</description><pubDate>Wed, 24 Feb 2010 07:24:36 GMT</pubDate><dc:creator>Martin Vrieze</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>actually i was waiting for a loooooooong time to read this type of article to get basic ideas, really very good article, thanks a lot</description><pubDate>Wed, 24 Feb 2010 02:58:26 GMT</pubDate><dc:creator>chenthilar</dc:creator></item><item><title>RE: A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Although I'm glad to learn new ways of loading an multidimensional database structure, I do feel the solution to be a bit cumbersome.This could of course be due to that you dont have control over the complete ETL-chain.Our dw also has normalized database and multidimensional database. The norm is loaded daily from 20 different sources and we perform incremental updates of the two main fact tables, Sales and GL. From textfile to updated transaction table, given about 3 million updated rows in each table every night, we have a processing time of about 30 mins.The logic has the following bases1. Norm dim tables create and hold surrogate keys for items, accounts etc. This means surrogate keys are fixed and not recreated every night2. Norm fact tables have surrogate keys for; a: every dim member on the row, b: for the fact row itself3. All norm tables have an "LatestUpdated"-flag which is set when a nightly load updates the row4. Each load step is logged, saving the "latest succesful run data"-flagThe dw update then do this1. Truncate and reload of all dimension tables. This works since norm keep all surrogate keys2. Incremental update of the Fact table using a comparison of "LatestUpdated" and "LastSuccessfulRun". Minimum rows are touched and surrogate keys are already in place3. Reports tables/cube loads follow...I think this approach seem simpler and less error prone. This of course require that you have control of how the norm database is loaded.</description><pubDate>Wed, 24 Feb 2010 02:32:42 GMT</pubDate><dc:creator>Magnus Hagdahl</dc:creator></item><item><title>A faster way to prepare dimensional databases</title><link>http://www.sqlservercentral.com/Forums/Topic871671-1497-1.aspx</link><description>Comments posted to this topic are about the item [B]&lt;A HREF="/articles/Analysis+Services/69343/"&gt;A faster way to prepare dimensional databases&lt;/A&gt;[/B]</description><pubDate>Tue, 23 Feb 2010 22:05:12 GMT</pubDate><dc:creator>Adam Aspin</dc:creator></item></channel></rss>