How Small A Use Case Is It?
For myself - pretty darn small. I can only assume it's also a fairly "edge" case out there in general. In my specific situation, I'm attempting to load 22 million rows into a fact table. At the point in the data flow where this problem crops up, the row is 88 bytes wide (not counting overhead). I'm running this on my laptop with 8GB of RAM, over 4GB "free". My data flow should only take about 2GB of RAM to process (again - not counting overhead, and being VERY simplistic). I should have enough headroom.
However, I am doing something fairly interesting in the data flow that I've described in a partially-finished blog entry that's been unfinished for months and months. I'm allocating a total quantity over thousands of rows, and want to make sure I have no rounding errors in the allocation. In other words, I'm splitting a total dollar amount amongst a lot of rows according to a proportional formula, and I want to make sure it all adds back up to the original amount.
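The post describes the allocation only at a high level, but the core idea - split a total proportionally, then push the rounding remainder onto a few designated rows so everything adds back up - can be sketched outside of SSIS. This is a minimal illustration of that idea (not the author's actual SSIS expressions; the function name and the "absorb it in the last row" choice are my own assumptions):

```python
from decimal import Decimal, ROUND_HALF_UP

def allocate(total, weights):
    """Split `total` across rows in proportion to `weights`, rounding
    each piece to cents, then push any rounding remainder onto the
    last row so the pieces always sum back to the original amount."""
    total = Decimal(total)
    weight_sum = sum(weights)
    cents = Decimal("0.01")
    pieces = [
        (total * Decimal(w) / Decimal(weight_sum)).quantize(cents, rounding=ROUND_HALF_UP)
        for w in weights
    ]
    remainder = total - sum(pieces)  # rounding error, usually a few cents
    pieces[-1] += remainder          # adjust one row to absorb it
    return pieces

parts = allocate("100.00", [1, 1, 1])
# Each share rounds to 33.33, leaving 0.01 unaccounted for; the last
# row absorbs it, so the three pieces still total exactly 100.00.
```

In the data flow version, that "adjust one row" step is exactly what the 3% side of the Conditional Split is for.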
I'll leave it to that unfinished blog post I'll publish in the future to describe in detail how and why I go about doing it the way I do. But I have to describe it briefly to demonstrate the issue. It involves using a Conditional Split to funnel off a small percentage of rows (about 3%) from the main flow. The intent is to adjust those rows by adding or subtracting whatever "remainder" or rounding error occurred in the allocation calculation(s). It also involves a Sort, Aggregate, and Merge Join - a lot of blocking and partially blocking components - just the kind of thing that usually clogs up memory, but not in the way you'd think in this case.
More Diagnostic Details
Well - the errors you (and I) get seem to indicate that SSIS is failing for no reason. Here are verbatim errors from my output window:
Error: 0xC0047012 at Calculate Facts and Record Years: A buffer failed while allocating 79576 bytes.
Error: 0xC0047011 at Calculate Facts and Record Years: The system reports 68 percent memory load. There are 8577282048 bytes of physical memory with 2730938368 bytes free. There are 8796092891136 bytes of virtual memory with 8790742302720 bytes free. The paging file has 9099673600 bytes with 11153408 bytes free.
"68%" usage on an 8GB system, and it's complaining about not having enough memory? Seriously? 2.7GB physical memory "free", and there's still a buffer allocation failure?
I'm definitely no expert on analyzing what's going on, but I did break out the few tools I fumble around with. First, Task Manager. When I watch Task Manager's processes tab during a BIDS execution, I see that DTSDebugHost is (of course) doing all the work. The default columns shown in Task Manager include a "Memory" column - but it never gets very large - not even close to the 2GB I think I should see if my pipeline is stuffed with 22 million rows. If you run the second tool, "perfmon /res" (I highly recommend it!) and watch memory usage as the data flow runs, you'll see a TON of "free memory" - even as SSIS (and probably Windows itself, with little popup message boxes like at the top of this article) is telling you that you're running out of memory. A little discomforting, to say the least...
If you change the columns you're looking at in Task Manager to add the "Memory (Commit Charge)" one - you'll now see a more accurate picture that matches up with the complaints you're getting from Windows and SSIS about memory pressure. In my flow execution, the Private Working Set doesn't exceed 500MB, but the Commit Charge climbs above 4GB!
What I Believe Is Happening
There's got to be a reason for this "excessive" commit charge, while actual memory "usage" is quite low, doesn't there? I can't explain it exactly, but my hunch is that it's due to memory fragmentation. Your package may not be actively holding on to that many bytes in memory, but if those bytes are sparsely used in a larger block of memory, then Windows may not be able to reclaim it for other processes (or itself).
It was that thought of "sparse usage" that made me look more closely at the Conditional Split I was using. For those of you that understand the Data Flow a little deeper than most - and you should if you read my blog - the Conditional Split doesn't make copies of data, or move data into two separate buffer stacks. It handles "splits" much more efficiently than that. Well, I should say efficiently with respect to CPU cycles and memory copying... because in our case, this efficiency is part of the problem. The Conditional Split adds a column to your data flow that only the data flow engine can see. (You can't - don't try.) Simplistically, this column is filled with the name of the output that row is supposed to flow out of. Components attached to Conditional Split outputs get ALL the buffers with ALL the rows that entered the Conditional Split - but they only operate on those rows with the right output indicator in that special hidden column.
The net effect of this behaviour, in combination with how the Sort component works, is an accumulation in memory of a set of very sparse data flow buffers. In my flow, one of my Conditional Split outputs - the one with 97% of the rows - is processed in a part of the flow where they get "consumed" by a Merge Join. The buffers that contain those rows ought to get released from memory, because those rows are no longer used. Except that they can't be - because those same buffers also hold the 3% of rows that went out the other output of the Conditional Split.
The 3% flow gets "consumed" by a Sort component. Except that I don't really think those rows get consumed right away. I believe that the Sort simply queues up the buffers, waiting for all of them to be given to it. It's doing what it's been designed to do - acting like a fully blocking component that needs to see all the buffers/rows before it emits any. However, the Sort is acting particularly poorly (in this case) because it's keeping the incoming rows inside the incoming buffers. Normally this isn't bad, and would actually be very smart. In this edge case of ours, it would make more sense to copy the incoming rows to another internal memory structure, releasing the buffers in which 97% of rows are "empty". But in normal cases, where you're sorting all (or almost all) of the rows inside an incoming buffer, it would be a waste of CPU and memory to make a copy of the incoming rows... that you're just going to discard anyway once you emit the sorted rows.
Bottom line - the Sort ends up holding on to (almost) EVERY ONE of my sparsely populated buffers. Buffers sized to hold 22 million rows, but only holding 1 million rows marked for its use by the Conditional Split.
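Rough arithmetic, using the figures from this post, shows why that hurts. Assuming (as a simplification) that the Sort ends up pinning buffers spanning essentially all 22 million rows at 88 bytes each:

```python
# Back-of-envelope cost of the sparse buffers (author's figures).
row_bytes  = 88
total_rows = 22_000_000
kept_rows  = 1_000_000   # the small slice actually routed to the Sort

held   = total_rows * row_bytes   # bytes the Sort keeps pinned
useful = kept_rows * row_bytes    # bytes it actually needs

held_gib   = held / 2**30         # ~1.8 GiB of committed buffers...
useful_gib = useful / 2**30       # ...for well under 0.1 GiB of live rows
```

Nearly two gigabytes of commit charge to hold under a hundred megabytes of rows - which lines up with a small Private Working Set alongside a ballooning Commit Charge.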
The Solution - Defragment Your Buffers!
For those of you that scrolled to the bottom to get the TL;DR solution... you shouldn't have, because this probably won't work for you. It'll only work for those people that understand the problem they're having.
That said, it's fairly harmless to try this solution out. But do record baseline performance metrics on what your data flow's memory, CPU, disk, and time usage looks like so that you can determine if there's an improvement or not.
How do you "defragment" those sparsely populated buffers? Use a Union All.
The Union All is another asynchronous component just like the Sort - but it doesn't block like the Sort does. The Sort component won't process any inbound row until it has accumulated all of them. The Union All processes every incoming row as soon as it sees it... it just doesn't emit output buffers until it has full ones (or it's reached the end of the data).
We're going to use that behaviour to our advantage here. If you simply drop a Union All in your data flow between your Conditional Split and Sort, you'll be telling SSIS to behave in a much better manner for your (and my) unusual data flow. The "sparse" buffers that get emitted by the Conditional Split will get "consumed" by the Union All. (In actuality, I believe those rows get marked as "dead" by the data flow engine, and when all rows in a buffer are marked "dead", the engine can dispose of the buffer.) New buffers emitted by the Union All are dense, requiring much less "commit charge" to hold.
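The "defragmenting" effect of the Union All can be sketched abstractly: live rows are copied out of sparse buffers into freshly packed ones, and the originals can then be released. This is a conceptual sketch (buffer size and the `None`-marks-a-dead-row convention are my own simplifications, not SSIS internals):

```python
# Sketch of the "defragment" effect: copy only live rows out of sparse
# buffers into freshly packed (dense) buffers, releasing the originals.
BUFFER_ROWS = 4

def compact(sparse_buffers):
    dense, current = [], []
    for buf in sparse_buffers:
        for row in buf:
            if row is not None:          # None = row "dead"/routed elsewhere
                current.append(row)
                if len(current) == BUFFER_ROWS:
                    dense.append(current)  # emit only full buffers...
                    current = []
    if current:                          # ...until the end of the data
        dense.append(current)
    return dense

sparse = [[1, None, None, None], [None, 2, None, None], [3, None, None, 4]]
packed = compact(sparse)   # one dense buffer instead of three sparse ones
```

Three mostly-empty buffers collapse into one full one - the same reduction in commit charge that dropping a Union All between the Conditional Split and the Sort achieves in the real data flow.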
It worked wonders for me on this particular data flow. My Commit Charge was only a few hundred KB higher than my Private Working Set for the package execution, and I received no memory warnings or errors on multiple executions. If you know you're manufacturing "sparse" buffers in your Data Flow - a Union All might be able to help you out.