SSIS Data Flow Task Crashing

  • I have a simple data flow task that moves logging events from one database to another. It pulls log events using an OLE DB source (I've tried the "Table or View" data access mode as well as "SQL Command"). The flow then goes through a Multicast and an OLE DB Lookup against each of 5 related tables to denormalize the data. I've tried full cache, partial cache, and no cache for each of the lookups. Each branch then inserts into one of the 5 corresponding logging tables through an OLE DB destination, which I've tried with and without fast load. The source and destination databases are all on another server.

    It seems to work fine in small volume (<50,000 records), but crashes at random points with higher volume. The point it crashes at varies in about 10 test runs, sometimes as low as after 70,000 records pulled from the datasource, and sometimes as high as 980,000 records pulled from the datasource. I'm trying to run it through 1,200,000 records, and it has come close but crashes before it completes every time.

    I've tried executing in BIDS while debugging, but it just stops and the output window shows "Finished: Canceled". I've also tried from the command line as well as SQL Agent, executing under a few different accounts from network service to domain administrator. Each crash logs a dump file to the file system, which shows the exception "The thread tried to read from or write to a virtual address for which it does not have the appropriate access".
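    For reference, running it outside of BIDS has just been plain dtexec against the package file, roughly like this (the folder and package name below are placeholders, not the real ones):

      rem 64-bit dtexec at the default SQL Server 2012 install path; /REPORTING EW limits console output to errors and warnings
      "C:\Program Files\Microsoft SQL Server\110\DTS\Binn\DTExec.exe" /F "D:\Packages\LogEventLoad.dtsx" /REPORTING EW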

    From what I've read, this seems to be resource related, maybe a virtual memory problem? The system has 16GB of RAM and runs Integration Services almost exclusively. The process seems to peak at about 1.5GB of RAM, and when I monitor resources the system never seems to have less than about 8GB of RAM free, so RAM doesn't seem to be the choke point.

    I've tried raising and lowering the "DefaultBufferMaxRows", "DefaultBufferSize" and "EngineThreads", but without any real understanding of their impact. In any event, it doesn't seem to affect stability if I modify those values. I've also tried specifying the "BufferTempStoragePath" to a fast SSD drive with plenty of space, but the data flow never seems to need the buffer temp storage - it stays empty.
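    For what it's worth, those properties can also be overridden per run from dtexec with /SET rather than editing the package each time; a sketch of that looks something like this (the task name "DFTLoadLogs" and the paths are placeholders, and DefaultBufferSize is specified in bytes):

      rem Override the data flow task's buffer properties for a single run (values are just examples)
      "C:\Program Files\Microsoft SQL Server\110\DTS\Binn\DTExec.exe" /F "D:\Packages\LogEventLoad.dtsx" ^
        /SET \Package\DFTLoadLogs.Properties[DefaultBufferMaxRows];50000 ^
        /SET \Package\DFTLoadLogs.Properties[DefaultBufferSize];26214400 ^
        /SET \Package\DFTLoadLogs.Properties[BufferTempStoragePath];E:\SSISBufferTemp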

    By default, there's 16GB of swap space on the system drive, and I've tried tripling it by adding 16GB to each of two other drives, but it doesn't seem to have any impact.

    This system is a Windows Server 2008 R2 Virtual Machine running in a Windows Server 2012 Hyper-V Host. SQL is Standard 2012.

  • I think I've found a workaround, but I'm not sure why this has to be done this way.

    Each log event will only match 1 of the 5 lookups. So instead of having a multicast that feeds the data into each of the 5 lookups, I just feed the data into the 1st lookup. Any non-matching rows from the first lookup are passed on to the 2nd lookup, then the 3rd, and so forth.

    This works if I don't multicast, and I've successfully executed a test run of 2,400,000 records without any issues.

    But I'm not sure why the multicast would cause this crash, has anyone experienced this before?

    EDIT: None of the other changes matter to stability if I flow the data this way. It works with fast load, full cache, no cache, etc... so it does seem to be the multicast.

  • It sounds like it's running out of memory, offhand. Have you watched the system bottlenecks while this runs? If you're in BIDS, you need to watch your own machine.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

  • I have, and there's plenty of RAM free (8GB of 16GB free when it crashes), but is there another limitation that I'm running into? Something per-process?

    I read that there's a 2GB limit for each process, but as far as I can tell that only applies to 32-bit, and I'm running 64-bit. Or is there somewhere else I should be looking for memory usage?
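    In case it's relevant, the places a 32-bit runtime can sneak in are the project's Run64BitRuntime debugging setting in BIDS, the "Use 32 bit runtime" option on a SQL Agent job step, and which dtexec executable actually gets invoked; the two live in different folders (default 2012 paths below, and the package path is a placeholder):

      rem 64-bit runtime (not subject to the 32-bit per-process address space cap)
      "C:\Program Files\Microsoft SQL Server\110\DTS\Binn\DTExec.exe" /F "D:\Packages\LogEventLoad.dtsx"

      rem 32-bit runtime -- this is the one a 2GB-ish per-process limit would apply to
      "C:\Program Files (x86)\Microsoft SQL Server\110\DTS\Binn\DTExec.exe" /F "D:\Packages\LogEventLoad.dtsx"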

  • kevin.kembel (4/16/2013)


    I have, and there's plenty of RAM free (8GB of 16GB free when it crashes), but is there another limitation that I'm running into? Something per-process?

    I read that there's a 2GB limit for each process, but as far as I can tell that only applies to 32-bit, and I'm running 64-bit. Or is there somewhere else I should be looking for memory usage?

    Apologies Kevin, you did mention that in your post above and somehow I skimmed past it to get to the update in the second post.

    I don't have 2012 to experiment with, unfortunately, but there are a few other items worth looking at.

    The error itself seems more related to the Visual Studio environment than to the SSIS engine. Out of curiosity, did you run the original package outside of VS at all and get the same errors? i.e. fired it up from a SQL Agent task? You mention you tried to execute while debugging; I'm just not sure where else you've run this from that also produces the errors.

    In regard to your other questions, caching on the lookups just has to do with the lookup data. It'll apply some memory constraints, but usually nothing horrible unless you're loading it with a few thousand VARCHAR(100) fields or something equally large.

    What's the row size of the data in the flow? Are you working with BLOB fields in the flow?

    If you really want to get deep into SSIS optimization, an interesting place to start:

    http://technet.microsoft.com/library/Cc966529

    In particular in that article is the following information:

    Estimated Row Size – Estimated Row Size is not a specific SSIS setting. Rather, it is something that SSIS calculates based on the metadata that it collects about your source data at design time. You can shrink the row size by identifying the smallest possible data types for all of your columns as early in the data flow as possible. This is especially important for flat file data sources because each column is automatically read into SSIS as a string data type unless you configure the column’s data type explicitly.

    DefaultMaxBufferRows – DefaultMaxBufferRows is a configurable setting of the SSIS Data Flow task that is automatically set at 10,000 records. SSIS multiplies the Estimated Row Size by the DefaultMaxBufferRows to get a rough sense of your dataset size per 10,000 records. You should not configure this setting without understanding how it relates to DefaultMaxBufferSize.

    DefaultMaxBufferSize – DefaultMaxBufferSize is another configurable setting of the SSIS Data Flow task. The DefaultMaxBufferSize is automatically set to 10 MB by default. As you configure this setting, keep in mind that its upper bound is constrained by an internal SSIS parameter called MaxBufferSize which is set to 100 MB and can not be changed.

    MinBufferSize – While MinBufferSize is not configurable, it is still important to know about it because SSIS uses this internal parameter to gauge whether you have set the DefaultMaxBufferSize too low. MinBufferSize is defined by the granularity of your operating system’s virtual memory allocation functionality. Typically, this is set to 65,536 bytes, but it differs from machine to machine.

    This is part of why I ask how large your row is. In 32-bit, a process gets 2GB of address space by default (up to 4GB with large-address-aware on a 64-bit OS); in 64-bit it's... well, bigger than your memory can probably physically insert into the boards. 🙂

    I realize I may be asking a few questions that are at 'is the thing plugged in' levels, but I just want to make sure I've got apples to apples going before we dive deeper... particularly with BLOB fields.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

  • Thanks for your reply again, sorry it took a bit for me to get back in here.

    I've tried running from the BIDS debugger, from the dtexec command line, and as a SQL Agent job, and all of them crash and create the same dump file.

    My data types are fairly simple: no calculated fields on sources or destinations, and a maximum of 15 columns depending on the lookup. All of the data types are uniqueidentifier, int, bit, or varchar with a size of 102 or less. No BLOBs, and each record is quite small.
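    As a back-of-envelope check on the buffer math from that article (the column mix below is a guess at my widest row, and these are rough in-memory widths rather than exact SSIS buffer accounting):

      rem Rough widest-row estimate: 1 uniqueidentifier (16) + 4 ints (4 each) + 1 bit (1) + 1 varchar(102)
      set /a ROWBYTES=16 + 4*4 + 1 + 102
      rem DefaultBufferSize defaults to 10 MB = 10485760 bytes
      set /a ROWSPERBUFFER=10485760 / ROWBYTES
      echo Estimated row bytes: %ROWBYTES%, rows that would fit in a 10MB buffer: %ROWSPERBUFFER%

    That works out to roughly 135 bytes per row, so the default 10,000 rows per buffer is only around 1.3MB, nowhere near the 10MB DefaultBufferSize, which is another reason I don't think buffer sizing is the choke point.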

    The lookup caching takes up quite a bit of RAM, about 1.5GB total across the 5 lookups. But like I mentioned before, the system shows about 8GB of RAM free at the time of the crash, so I'm not sure that's the issue.

    I also don't see why a multicast flowing data into 5 lookups is so different from flowing data from one lookup to the next to the next (the chain-style flow works, but the web-style flow doesn't). When I run it now, with each lookup flowing into the next, it can handle significant volume; I've successfully done up to 4 million rows this way, where I couldn't get anywhere near that using the multicast.

    The performance tuning settings don't seem to apply here either: if I chain the lookups together, I can set those however I want and it doesn't affect stability, just performance as you'd expect. It just seems like I can't multicast to 5 other tasks without it crashing.

    EDIT: Not sure if I explicitly mentioned that there are zero transformations as well. It loads from a view using OLE DB, straight into a multicast. The multicast flows to the 5 lookups, and each lookup does an OLE DB insert into one of the 5 destination tables. If I restructure it to load from the view, then go to the first lookup, then the second, third, etc., it works fine.

  • I have no idea, honestly. At that point, I'd seriously look into calling Microsoft and having them examine the design. If anyone can find root cause, it's them.

    There's just not enough information out there to troubleshoot that particular concern. 5x the data flowing through different streams I could see causing significant memory pressure... but you don't have any. I'm stumped.

    Time to call in the devs. 🙂


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA
