﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / Discuss Content Posted by Andy Leonard / Article Discussions / Article Discussions by Author  / Anatomy of an Incremental Load / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Mon, 20 May 2013 21:06:56 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>[quote][b]msmithson (12/26/2012)[/b][hr]This was a great article, very straightforward, helpful, and exactly what I needed. Now I can load my data without just truncating and reloading. Thanks!:w00t:[/quote]Thanks msmithson - I am glad it helped!:{&amp;gt;</description><pubDate>Wed, 26 Dec 2012 19:39:33 GMT</pubDate><dc:creator>Andy Leonard</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>This was a great article, very straightforward, helpful, and exactly what I needed. Now I can load my data without just truncating and reloading. Thanks!:w00t:</description><pubDate>Wed, 26 Dec 2012 17:20:04 GMT</pubDate><dc:creator>msmithson</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>@AndyThat was simply Amazing explanation about Incremental Load using SSIS.Thanks and Appreciated for all your Help.Bheeshma K</description><pubDate>Mon, 19 Mar 2012 12:15:12 GMT</pubDate><dc:creator>kaalabhairavaa</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>[quote][b]rockstar283 (6/4/2011)[/b][hr]Guys..what is your opinion about indexes..as per my knowledge indexes are heavily used in OLAP..but I generally disable the indexes before loading new data and again enable them back after loading the data..do you think its a good strategy?[/quote]Indexes can slow inserts. If they cause enough pain, I drop them before the load and then re-apply afterwards.:{&amp;gt;</description><pubDate>Sun, 05 Jun 2011 15:00:22 GMT</pubDate><dc:creator>Andy Leonard</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Guys..what is your opinion about indexes..as per my knowledge indexes are heavily used in OLAP..but I generally disable the indexes before loading new data and again enable them back after loading the data..do you think its a good strategy?</description><pubDate>Sat, 04 Jun 2011 12:13:03 GMT</pubDate><dc:creator>rockstar283</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Hi Andy, Great post, it really helped me out of a bind! Keep up the good work, as its really hard to find good SSIS tutorial examples. CheersChris</description><pubDate>Thu, 13 Jan 2011 04:46:25 GMT</pubDate><dc:creator>FairFunk</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>hi andy!above examples are working when primary key column data type is [b]int[/b],[i]but not in case of var char[/i], as i have course in course_dim table,what to do with this case??2)[b]when it found &amp; loaded the new row dwh,next time it again loading that same one along new rows,why?[/b]3)i want to send "new" &amp; changed rows to destination,what to do in this case,describe it too plzhope to hear from you soon :)</description><pubDate>Mon, 13 Dec 2010 11:12:19 GMT</pubDate><dc:creator>mhassanshahbaz</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Love your tag :)</description><pubDate>Thu, 24 Jun 2010 11:24:45 GMT</pubDate><dc:creator>epriddy</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Solid!:cool:</description><pubDate>Tue, 08 Jun 2010 13:09:44 GMT</pubDate><dc:creator>cyril.capistrano</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Hi Cyril,   I encourage you to not only post in the forums - you should also consider writing about your experiences! Steve Jones is always on the lookout for new material and authors.:{&amp;gt; Andy</description><pubDate>Tue, 08 Jun 2010 12:58:34 GMT</pubDate><dc:creator>Andy Leonard</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>That was the first post I've ever done and I thank you for your useful response. I might just post some more!Thanks!:-)</description><pubDate>Tue, 08 Jun 2010 12:24:40 GMT</pubDate><dc:creator>cyril.capistrano</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Hi Cyril,There are use cases where Truncate-and-Load will perform better than an Incremental Load. Keep in mind that SQL Server - and SSIS - performance is impacted by lots of outside factors, starting with network topology and extending to the size and shape of the data itself. The science of Lookups is a field all by itself, and I barely touched on it in this article. In other writing I talk about Change Detection (SQL Server 2008 Change Data Capture is one mechanism) which offers some relief to some incremental loads. Note this will not solve everything all the time - and if I didn't make that clear earlier allow me to do so now: there are some use cases for which Incremental Loading is *not* the answer. You may very well have hit upon one.As an engineer, I always recommend testing to see which will perform best. It sounds as if you have performed these tests and found your best solution.:{&amp;gt; Andy</description><pubDate>Tue, 08 Jun 2010 12:14:58 GMT</pubDate><dc:creator>Andy Leonard</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>I respectfully disagree that incremental loads are faster than the truncate method. The problem with an incremental load is the amount of resources it uses on the SSIS side to manage joins and lookups. The need for SSIS to load the dataset into memory before it can perform a comparison is great. In addition to the resources required to pull up the data in SQL, the amount of resources needed by the SSIS machine to manage the comparisons may even be greater. Part of this is the need to ensure string comparisons are done with trimmed fields, re-sorting, and data conversions that are performed to ensure the comparisons are accurate. I've yet to see a comment somewhere where the accuracy of the comparisons are put to question but they do occur if these are not taken to consideration. This is due also to the fact that the collation of the SQL server is different from the collation of the Windows Server machine that runs the SSIS package. From my experience, we have processed, logged, validated, and loaded about 60 million records in 2.5 hours using the truncated method (with drop/re-create specific indexes), as opposed to the incremental that takes about 6-8 hours.</description><pubDate>Tue, 08 Jun 2010 11:46:34 GMT</pubDate><dc:creator>cyril.capistrano</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>There are use cases for SSIS that are not served well by using T-SQL. The example used in the article is not a good use case for SSIS over T-SQL - at least not all of the time. SSIS is suited for moving data between instances of SQL Server on different physical servers or between SQL Server and another database platform. SSIS can also perform better than T-SQL on certain large data transfers between SQL Server databases.As is often the case, the correct answer is "it depends". I encourage developers to test and see which will perform better: T-SQL or SSIS. That's what I do.:{&amp;gt; Andy</description><pubDate>Tue, 08 Jun 2010 06:50:35 GMT</pubDate><dc:creator>Andy Leonard</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>i prefer to use the merge join transformation combined with a conditional split. For more complex situations use the konesans checksum algorithm transformation to compare staging table vs fact/dimension table.regardsRobert Cooke</description><pubDate>Mon, 07 Jun 2010 23:12:07 GMT</pubDate><dc:creator>rcooke</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Um, just ooc - I didn't see any discussion of just doing this in SQL?Why use SSIS at all?</description><pubDate>Mon, 07 Jun 2010 21:45:33 GMT</pubDate><dc:creator>jcraigue@cox.net</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Hello Andy, thanks for a great overview using SSIS.  I especially like the detailed explanation with screenshots.  We currently use a stored procedure to handle our comparisons and upserts, once the data is loaded using SSIS.  One of these steps is a stored proc to ensure the lookup tables are populated with any new values before updating the main table.  Any particular thoughts on handling this or would we perform "mini" processes similar to this step?</description><pubDate>Mon, 24 Aug 2009 04:36:48 GMT</pubDate><dc:creator>tskelley</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>If you have million rows in source table. it may take longer time to do lookup and conditional split comparison than simply batch load all data into destination table</description><pubDate>Thu, 20 Aug 2009 15:46:10 GMT</pubDate><dc:creator>qiyuef</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>A very interesting article, which I have bookmarked for future reference.  I, too, find SSIS difficult to learn coming from a VB/.NET/T-SQL background.  I ended up building an extensible VB.NET task engine with process-specific task handlers instantiated by reflection and data processing written in T-SQL stored procedures rather than trying to fight with SSIS to handle my very specific external EDI file import, error handling, reprocessing, and reporting requirements.One issue I have is that incremental loads are NOT always faster.  I had one case with one step in an ETL process where an incremental load of the last 30 days of data, with aggregation occurring when selecting from the data source, took 3 minutes (~4,300 rows).  FYI, the process pulls data from a remote server to the data warehouse.I was unhappy with that result, so I tried a TNL (Truncate-N-Load) method for all data from 8/1/06 to present (~128,000 rows).  It took 10 seconds.  The TNL method time blew me away versus the incremental method's time.  It wasn't what I expected, but I guess we live and learn.</description><pubDate>Tue, 18 Aug 2009 06:40:08 GMT</pubDate><dc:creator>jpratt-797544</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Hi Andy,good article. I didn't read all the posts, so maybe someone write about it, but did you know TableDifference component ([url=http://www.sqlbi.eu/Default.aspx?tabid=74]http://www.sqlbi.eu/Default.aspx?tabid=74[/url])? I used it a couple of times and it's more easy than the way you show us.</description><pubDate>Mon, 17 Aug 2009 04:48:35 GMT</pubDate><dc:creator>Carlos Sacristan</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Hi mescotty22,   This article was originally published in early 2008. The purpose of the article was to introduce and demonstrate one way of loading data incrementally using SSIS. It's not the only way, and this design pattern is not the ultimate solution for all incremental loading strategies. For me, this (or something very similar to this) is the usual starting point.    If this doesn't perform for me as desired, I tinker with the design. How and where I tinker... depends.   There were some really cool suggestions about approaches to making this solution scale. Some can be found earlier in this forum near this post ([url]http://www.sqlservercentral.com/Forums/FindPost491603.aspx[/url]). Some are things I've tried before and some were things I'd never tried.    Everyone is working in different environments. 100 million rows is a lot - sometimes. Data shape comes into play with scale, as does hardware. I currently work on servers with 24 to 96 CPUs and 256 to 1,024 GB RAM. Some 100M row loads on these systems are, frankly, no big deal.    We're all working in different environments with different requirements. My intent with this article was to help folks get started understanding incremental loads with SSIS.:{&gt; Andy</description><pubDate>Sun, 16 Aug 2009 13:00:13 GMT</pubDate><dc:creator>Andy Leonard</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Understood Scott. And I think your contribution will likely help me in my work in the future. And I personally feel that all technical examples offered in venues such as this should have volume performance limitations investigated and noted in the article, as this issue can be one of the "deal breakers" for an ETL solution, and an unpleasant disappointment if users make incorrect assumptions.    And as I mentioned previously however, I also expect such performance limitations would NOT be an issue for many ETL problems, although as you have experienced, will be for some.Best,Ken</description><pubDate>Sun, 16 Aug 2009 11:41:57 GMT</pubDate><dc:creator>kenambrose</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Ken (and also Jay),I understand that my offering is not applicable in all cases and that some are not concerned about performance.  I am just illustrating alternatives for shops where the performance of a memory hogging Lookup Transformation and all-inclusive source result set will not suffice.I agree that Andy's solution is easier to maintain for the uninitiated, but I am something of a specialist.  I did not want beginners reading the article to think that this is the only or best solution for an incremental sync of environments.Scott WarrenETL Developer</description><pubDate>Sun, 16 Aug 2009 06:58:24 GMT</pubDate><dc:creator>mescotty22</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>My take on this article was that it was meant to demonstrate some SSIS mechanics for identifying new and changed rows, and, as such, I found it pretty useful, notwithstanding comments about performance.   Jay Taylor</description><pubDate>Sat, 15 Aug 2009 22:18:43 GMT</pubDate><dc:creator>Jay Taylor-604520</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Thanks Scott for adding some valuable content re. volume throughput.  One thing I  learned after working in a shop that integrated medical claims records (one om my accounts was 50 Mil. records per batch) is that it is not likely to encounter an ETL "pattern" that will scale well - even using SSIS - unless performance for a large data set has been considered in every step of the processing.Having said that - and in the author's defense -just as often the pain point and majority of costs in ETL processing are in the logic development and maintenance, and not the data volume.  Ken</description><pubDate>Sat, 15 Aug 2009 22:03:18 GMT</pubDate><dc:creator>kenambrose</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>This is a great solution for the sake of its simplicity.  However, it is not workable for environments with large transactional loads on the source and destination tables.I work at a company that provides real estate marketing solutions.  Our core operational database must be synced to a customized transactional layer that is both accessed by an enterprise gui backend and is replicated to redundant (web-facing drone) copies.This must be done as fast as possible.  I have experimented with the approach you mentioned in the article, but I found it extremely slow.I ended up creating an audit table for incremental, full, and custom syncs.  The logic is to go back to the previous sync date and time and stage the source data locally on the destination server.  Indexes are added, and a single source query renders "new" and "changed" records which are then conditionally redirected in the data flow to an ole db destination and a ole db command respectively.  This approach has virtually eliminated latency between the ODS source and transactional destination (and implicitly the drone destinations).Scott WarrenETL Developer</description><pubDate>Sat, 15 Aug 2009 14:13:19 GMT</pubDate><dc:creator>mescotty22</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Hi,A very nice article.I have few things to add for this post1.Why should we bring all the data from the source during incremental load.Why can't we bring the data that is created or modified yesterday2.We have tested and using the approch like we will have a date parametes append to source sql so that it will pull only the rows that has been insterted or modified3.Then we can use SCD or lookup depending uppon the data delta</description><pubDate>Sat, 15 Aug 2009 02:34:34 GMT</pubDate><dc:creator>anilpinnamaneni</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>maccenon, the editor corrected the missing information recently (thanks Steve!). The entire article should now display.zespri, this article is intended to demonstrate introductory principles of incremental loads. I actually have a script I execute against SQL Server sources that builds the Changed_Rows conditional expression dynamically. You are correct - the value substituted for NULL in the TRUE argument of the ternery conditional cannot appear in the data itself, or you run the risk of a logical collision. Data profiling will mitigate this but when loading human-entered data, all bets are off.casing815, thanks - excellent suggestion. Perhaps I could include information on the kinds of data profiling I do during ETL design.Shon, thanks. You are correct - the load demonstrated in this article could be accomplished using replication. However, I've seen replication used where SSIS should be used more times than I've seen the inverse - especially custom replication.Grant, thanks bro! Whack 'n Load does have the advantage of either failing or delivering an exact copy. That's an important distinction to make. When you filter, you run the risk of having an unwanted row slip through or be unintentionally blocked.casinc815, I've had a few opportunities to convert destructive loads to incremental. In one extreme case on which I merely consulted, the client was able to scale the ETL process 40x within the same maintenance/load &amp;#119;indow. [i]That[/i] was cool!Alvin, durnit! Here's hoping you get the chance. As others have pointed out, this example doesn't exhaust all you'd need to do to gain performance from your load. You would want to pay particular attention to the lookup strategy, and I'd seriously consider implementing a Change Data Capture (CDC) mechanism. Philippe, thanks. I know very little about Oracle but I've heard Oracle has Change Data Capture capabilities. I've built my own CDC in the past, usually implemented with triggers and default constraints. I've seen some implemented in a stored procedure API. The bottom line is you need something doing automatic updates/inserts in which you have complete confidence. I should probably write about this as well - I've used bit columns, int columns, char(1) columns, and LastUpdated/ModifiedDate DateTime columns. The idea is to be able to detect rows that were changed at the source, then you don't have to load those records into the Source Adapter in SSIS (only to throw them away at the Filter Conditional Split). The underlying assumption is that you will be able to alter the schema of the system of record, and this is not an option in many cases.Jay, thanks. It depends [(R)Andy Warren]. If the source is DB2 and the dataset is small enough and won't scale rapidly (that's a lot of conditions...), I use the OLEDB source as shown here. More times than not, I stage data from DB2 and Oracle in a staging SQL Server database. I make the staging database schema as close to the source system schema as possible, and I remove all the constraints from the staging tables. This makes for fast loads that interfere with the source system as little as possible. Plus, I now have the data in a SQL Server database which is, for me, more familiar territory. From there I can load the destination database incrementally. I'd likely stage the data incrementally as well - or transactionally, depending on the requirements of the system.:{&gt; Andy</description><pubDate>Fri, 14 Aug 2009 15:32:41 GMT</pubDate><dc:creator>Andy Leonard</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Andy,Great post.What if the source were db2 and the destination were Sql Server. Would you use linked servers for the initial query? Or is there another way?Thanks.</description><pubDate>Fri, 14 Aug 2009 14:23:46 GMT</pubDate><dc:creator>Jay Taylor-604520</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Nice article, This seems better than the out of the box SCD.I am intrigued by the comment about SQL2008 able to detect changed rows at the source. It is likely assuming a SQL Server 2008 source?Another idea would be to compare the rows at the source using a distributed query before loading the stage table(s) for update insert deleteFor me, my main source is Oracle, I use several methods, it depends the size, the context and the performance.- For small tables where I do not care about deletes, out of the box SCD is good- For small to medium tables where I need deletes, a solution like Andy's looks good- For large tables I do it in two ways, 1 way involve to put the production table offline for 1ms, the other involves to slow down access to the prod table.Updates and deletes can be slow, and doing all of it in place may require proper handling of locks.Solution 1Load the data i.e. 3 months, into a staging table insert into the staging table the production data older than 3 months Drop the production table Rename the staging table to production rebuild indexes-Solution 2 ( I wish I would know how to hash a natural key, I never found a way that guarantees 100% uniqueness and keep it small)Load the data into stage--   this in procedures called by ssis, use transaction, try catch and add back to update queue if deadlock are a concern. Delete from prod Where key not in (select s.key from s)---- Update p   Set p.val1 = s.val1       p.val2 = s.val2 From Prod p Join Stage s on p.key = s.key Where p.val1 != s.val1OR p.val2 != s.val2---- Insert into prod(val1, val2) Select Val1, val2 From Stage s Join prod p on s.key &lt;&gt; p.key  -- if this join too slow, use subquery</description><pubDate>Fri, 14 Aug 2009 13:37:57 GMT</pubDate><dc:creator>Philippe Cand</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Nice article Andy.It makes me want to revisit a recent package where I needed to find the records Added/Changed/Deleted.  Unfortunately I no longer have access to the data.  One table had over 89 million rows.  It would have been a good test of how well the lookup on a large table.</description><pubDate>Fri, 14 Aug 2009 10:52:47 GMT</pubDate><dc:creator>Alvin Ramard</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Touche'It would only be used in development of a warehousing project.  Afterr initial load of all data, incremental should be the choice.  I have also run into the opinion that trunc and reload would be preferred.  This example is companies that are managing 50G of less in data.  Maybe this will work fine.  But if you move Gig's daily (nightly) then incremental is truly the way to go.Jim</description><pubDate>Fri, 14 Aug 2009 10:39:34 GMT</pubDate><dc:creator>casinc815</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Excellent article, thanks Andy. It's also very well timed for my company. We're having some trouble convincing people that trunc &amp; reload is NOT the industry best practice.</description><pubDate>Fri, 14 Aug 2009 08:00:27 GMT</pubDate><dc:creator>Grant Fritchey</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>This is a good example that I would consider for use with a non-sql server based implementation.  However, for sql server to sql server implementations, this could be accomplished much easier with replication.</description><pubDate>Fri, 14 Aug 2009 07:55:30 GMT</pubDate><dc:creator>Shon Thompson</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Andy:Well done.  Enjoyed the article since I am a Microsoft BI developer/data architect and have gone through much ofwhat you have written.May I suggest for an article, how the source dB is prepared for incremental loads?  Datetime modified, datetime created, SID's or GID's, etc.Great work!  Love this forum!Jim O'Toolecasinc815@hotmail.com</description><pubDate>Fri, 14 Aug 2009 07:29:37 GMT</pubDate><dc:creator>casinc815</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>[quote][b]Andy Leonard (2/11/2008)[/b] As I mentioned in the article, "[i]NULL-handling in SSIS Expression Language could darn-near rate its own article![/i]"[/quote]We ended up doing something like this:(ISNULL([Target.RebateDay] != [Old.RebateDay]) ? !(ISNULL([Target.RebateDay]) &amp;&amp; ISNULL([Old.RebateDay])) : [Target.RebateDay] != [Old.RebateDay]) we have to repeat this for every nullable column which is a tad ugly. Thankfully in our case there are not many nullable columns.Your approach works with strings, but a) there are other data types and b) what if values change from null to empty string and back on a regular basis on a large percentage of rows?PS. I haven't read the whole thread, my apologies if this or the issue above has already been addressed.</description><pubDate>Fri, 14 Aug 2009 02:38:53 GMT</pubDate><dc:creator>zespri</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Beware that this approach doesn't seem to work well with big tables. If you have many millions of rows and a couple dozen columns, you will run out of memory, as lookup transformation pre-caches the whole dataset. It is possible to configure partial cache/no cache, but in this case execution speed degrades to a crawl.On the other hand if a table is under a million records and you don't have that many columns that can change in an incremental load, then this approach does work very reliably.</description><pubDate>Fri, 14 Aug 2009 02:31:59 GMT</pubDate><dc:creator>zespri</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Hi,Nice article it helps me a lot so thanks!But :( I think that there is something missing beetwen [b]"Write: Changed Rows"[/b] and the [b]"Conclusion"[/b]. Even numbers of images aren't correct. Could someone look on it.Thanks !</description><pubDate>Thu, 16 Jul 2009 07:38:39 GMT</pubDate><dc:creator>maccenon</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>I've implemented the Anatomy of Incremental load the same was as it was illustrated. The package was working for 2 months. The  ETL SSIS package that loads a dimension and two fact tables. The package has been running successfully for the past two months. But started failing last week. It runs perfectly in the development server but fails on production.I did some troubleshooting that found that it only fails when trying to load the final fact table and followed some recommendations that were posted but kept getting similar errors” [7131]: SSIS Error Code DTS_E_OLEDBERROR.  An OLE DB error has occurred. Error code: 0x8007000E. Error: 0xC020204A at fact_contract_count incremental load, contract count exception [7131]: Unable to retrieve column information from the data source. Make sure your target table in the database is available. Error: 0xC004701A at fact_contract_count incremental load, DTS.Pipeline: component "contract count exception" (7131) failed the pre-execute phase and returned error code 0xC020204A.  Error: 0xC0047031 at fact_contract_count incremental load, DTS.Pipeline: The Data Flow task failed to create a required thread and cannot begin running. The usually occurs when there is an out-of-memory state. Error: 0xC004700E at fact_contract_count incremental load, DTS.Pipeline: The Data Flow task engine failed at startup because it cannot create one or more required threads. I completely disabled the email and error event handlers and I keep getting the same error or similar errors. Also tried set the AlwaysUseDefaulCodePage Custom Property of the OLE DB Source control to True. And did the same for the destination but I get similar errors. In addition, I even set the delay validation to true. But didn’t help.</description><pubDate>Tue, 10 Mar 2009 13:28:24 GMT</pubDate><dc:creator>dbdeli</dc:creator></item><item><title>RE: Anatomy of an Incremental Load</title><link>http://www.sqlservercentral.com/Forums/Topic447281-208-1.aspx</link><description>Hi dbdeli,   Since you're checking for a Null and all columns returned will be Null for a new row (a row where the Left Outer Join fails to find a match), you can simply pick any one of them for the test (given your join is defined as ON all four columns with AND operations).:{&amp;gt; Andy</description><pubDate>Wed, 07 Jan 2009 14:38:23 GMT</pubDate><dc:creator>Andy Leonard</dc:creator></item></channel></rss>