Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

Slowly Changing dimension inside a ForEach Loop container Expand / Collapse
Author
Message
Posted Friday, August 15, 2014 3:17 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Wednesday, September 10, 2014 11:02 AM
Points: 47, Visits: 136
Hi there folks,

Sorry for the lengthy topic but this is my first post - as you can tell!

I am having memory issues with my SSIS package.

I am running SQL Server 2005 Standard Edition on a Windows Server 2003 SP2 Enterprise Edition machine (Intel Xeon 5160 @ 3.00GHz, 4GB RAM) - yes, I am aware how obsolete and old the machine is!

Anyway, I have over 4 million records I need to load from a table that has been capturing weekly data from a HR system into a data warehouse.

The SSIS I have created goes through a ForEach Loop where it picks up one employee at a time (there are around 96,000 employees). There is another ForEach Loop that picks up the extract date. Inside the second ForEach Loop there is a Data Flow Task that has an OLE DB Source. I have used a SQL Command and I perform all the look ups in the SQL command. At this stage I am not concerned about null values from the lookup as I have done all this prior to loading the data. Therefore, I know that the SQL command with the lookups will return valid values. After the OLE DB source I have an SCD. I need to keep history for most of the columns in the table that I am loading. I am using 2 business keys in my SCD. The SCD works fine and does what it has to do.

The problem arises after some time. Normally at around 1 to 1.5 hrs the SSIS package fails with the following error (values differ from execution to execution):

Microsoft (R) SQL Server Execute Package Utility
Version 9.00.5000.00 for 32-bit
Copyright (C) Microsoft Corp 1984-2005. All rights reserved.

Started: 3:19:00 PM
Error: 2014-08-15 16:24:50.91
Code: 0xC0047012
Source: EmployeeFact with SCD DTS.Pipeline
Description: A buffer failed while allocating 9109056 bytes.
End Error
Error: 2014-08-15 16:24:50.91
Code: 0xC0047011
Source: EmployeeFact with SCD DTS.Pipeline
Description: The system reports 86 percent memory load. There are 4293980160 bytes of physical memory with 583577600 bytes free. There are 2147352576 bytes of virtual memory with 240123904 bytes free. The paging file has 9486200832 bytes with 4991451136 bytes free.
End Error
Error: 2014-08-15 16:24:50.91
Code: 0xC0047056
Source: EmployeeFact with SCD
Description: The Data Flow task failed to create a buffer to call PrimeOutput for output "OLE DB Source" (1) on component "OLE DB Source Output" (11). This error usually occurs due to an out-of-memory condition.
End Error
Error: 2014-08-15 16:24:50.91
Code: 0xC0047021
Source: EmployeeFact with SCD
Description: SSIS Error Code DTS_E_THREADFAILED. Thread "SourceThread0" has exited with error code 0x8007000E. There may be error messages posted before this with more information on why the thread has exited.
End Error
Error: 2014-08-15 16:24:50.91
Code: 0xC0047039
Source: EmployeeFact with SCD
Description: SSIS Error Code DTS_E_THREADCANCELLED. Thread "WorkThread0" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown. There may be error messages posted before this with more information on why the thread was cancelled.
End Error
Error: 2014-08-15 16:24:50.91
Code: 0xC0047039
Source: EmployeeFact with SCD
Description: SSIS Error Code DTS_E_THREADCANCELLED. Thread "WorkThread1" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown. There may be error messages posted before this with more information on why the thread was cancelled.
End Error
Error: 2014-08-15 16:24:50.91
Code: 0xC0047021
Source: EmployeeFact with SCD
Description: SSIS Error Code DTS_E_THREADFAILED. Thread "WorkThread0" has exited with error code 0xC0047039. There may be error messages posted before this with more information on why the thread has exited.
End Error
Error: 2014-08-15 16:24:50.91
Code: 0xC0047021
Source: EmployeeFact with SCD
Description: SSIS Error Code DTS_E_THREADFAILED. Thread "WorkThread1" has exited with error code 0xC0047039. There may be error messages posted before this with more information on why the thread has exited.
End Error
DTExec: The package execution returned DTSER_FAILURE (1).
Started: 3:19:00 PM
Finished: 4:24:52 PM
Elapsed: 3952 seconds


I monitor the server at all times when I am running this package and the CPU usage is always below 50%. I don't really have good counters to measure the memory usage but when I do a Window Task Manager I see the PF usage is hovering around 4 and 4.1 GB.

I also have played with the DefaultBufferMaxRows and DefaultBufferSize but I found that if I kept at the default values (10000 for DefaultBufferMaxRows and 10485760 for DefaultBufferSize) the package runs for longer than if I increase any of those values.

I am going nuts trying to find out what could be the reason for this memory failure? Is it the way the package is configured or is it the server (hardware)?

Please any input would be much appreciated!

Thanks
Romina
Post #1603892
Posted Saturday, August 16, 2014 4:29 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 7:23 AM
Points: 5,074, Visits: 11,852
96,000 times round a Foreach loop sounds like a performance-killer to me.

It would be useful background to understand why you have decided to do things this way rather than using a set-based approach.



Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

When you ask a question (and please do ask a question: "My T-SQL does not work" just doesn't cut it), please provide enough information for us to understand its context.
Post #1604027
Posted Sunday, August 17, 2014 7:54 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Wednesday, September 10, 2014 11:02 AM
Points: 47, Visits: 136
Hi Phil,

Thank you for your reply.

I am not sure I get you when you say a set based approach.

I was just really trying to minimize the number of records entering the SCD with a ForEach Loop. If I just go ahead and don't use the ForEach loop my SSIS package will take hours to complete because of the SCD.

Thanks
Romina
Post #1604257
Posted Sunday, August 17, 2014 11:52 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 7:23 AM
Points: 5,074, Visits: 11,852
The SCD transformation can perform badly when compared against other approaches.

What type of SCD are you using?

When I refer to a set-based approach, I am referring to performing actions on groups ('sets') of rows concurrently, rather than using any sort of looping. This is usually the standard and optimal way of working against data which is held in a SQL Server database.



Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

When you ask a question (and please do ask a question: "My T-SQL does not work" just doesn't cut it), please provide enough information for us to understand its context.
Post #1604270
Posted Monday, August 18, 2014 1:09 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 4:05 AM
Points: 13,724, Visits: 10,680
I have an article on how to implement SCD in SSIS using different methods and I compare them against each other.

Four Methods for Implementing a Slowly Changing Dimension in Integration Services – Part II

Giving that you work with SSIS 2005, I would go with the last method: custom implementation (with custom I mean build it yourself using standard components in SSIS, instead of using 3rd party components).




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1604290
Posted Monday, August 18, 2014 8:09 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Wednesday, September 10, 2014 11:02 AM
Points: 47, Visits: 136
Phil,

I am using the SCD wizard from SSIS 2005. Mostly Type 2 changes with very few Type 1.

What I am trying to achieve is retrieve all 4 million records from the old system to the new and the only way to group them is through the EmployeeID and the ExtractDate (a Datetime field that reflects when the extract was made from the old system to the new).

The data extract is done via a stored procedure run every week and literally "dumped" into a huge table that now has over 4 million records. Records for each employee for each date the extract was made. There are around 96,000 employees.

Initially the records will be mostly new but there are a lot of duplicated records that need to be cleaned out before loading into the new datawarehouse system. For example, for a particular employee there might be 141 records in this "huge dumping table" but when I perform the SCD, only 14 records make it to the DW after getting rid of duplicates and updating historical records.

Working with an employee at a time is fine, but with 96,000 employees the SCD simply takes way too long, and this is what I was trying to avoid with the ForEach loop.

I guess I will have to try Koen's approach and build a custom SCD.

Just one last question Phill, how much is the hardware (the server configuration) affecting performance? Or is it purely my SSIS that needs to be configured better?

Thanks again. I appreciate your help with this!
Post #1604464
Posted Monday, August 18, 2014 8:13 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Wednesday, September 10, 2014 11:02 AM
Points: 47, Visits: 136
Koen, thank you for this. Very helpful!

In the custom implementation example, is that a Merge or a Merge Join?

If it's simply a Merge, can I use the Union All instead?

In SSIS 2005 you need to sort the source before Merging it, which might cause performance issues. What do you think?

Thanks
Romina
Post #1604467
Posted Monday, August 18, 2014 8:15 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 4:05 AM
Points: 13,724, Visits: 10,680
romina (8/18/2014)
Koen, thank you for this. Very helpful!

In the custom implementation example, is that a Merge or a Merge Join?

Thanks
Romina


Neither.
Which number in the screenshot of the data flow are we talking about?




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1604468
Posted Monday, August 18, 2014 8:17 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Wednesday, September 10, 2014 11:02 AM
Points: 47, Visits: 136
4 and 9
Post #1604469
Posted Tuesday, August 19, 2014 12:42 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 4:05 AM
Points: 13,724, Visits: 10,680
romina (8/18/2014)
4 and 9


Ah yes, those are UNION ALL components (designated by the (ALL) naming conventions).
I used "merge" in their names because they combine two streams of data together.




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1604764
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse