SQL SERVER 2008 - SSIS PACKAGE HANGS

  • I would suggest you look into it, as you are queries ID rows, which I would assume is Unique, and I would assume the ID column has a primary key?

    Adding a index on the ID row will increase lookup times.

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    This thing is addressing problems that dont exist. Its solution-ism at its worst. We are dumbing down machines that are inherently superior. - Gilfoyle

  • I created index on ec_step_id :crying:, that did not work too .......again hanged....

  • Hey i have inserted the Data manually i mean by insert statement giving the same syntax to select the Data from Temp,

    After inserting the Data , i ran my Package just to see if it works, and it quickly executed(as there is no new data to load i guess), without any hanging ...i looked into the data and data was pretty fine , i cannot understand what makes it hang while loading the data into Main table Via package.......

  • I see room for improvement here in several places.

    First, you've got to use constraints and indexes in your table design. Even if you get this working, you'll run into problems down the road as your tables grow. Proper database design is paramount in database performance. If you ignore it now, you'll pay later. You need Primary Keys and clustered indexes on your tables!

    Second, for your source adapter query, the query you have is terribly inefficient. From the looks of it, you are basically trying to select rows from your temp table that do not exist in your destination table. A simple OUTER join would server you better than your NOT IN attempts. Using NOT IN like you've done means that SQL Server must reference the temp table twice in the same query. This results in way more reads than necessary. Here's a better replacement:

    SELECT t.*

    FROMdbo.DW_T_EC_JOB_TEMP t

    LEFT JOIN dbo.DW_T_EC_JOB jON j.EC_STEP_ID = t.EC_STEP_ID

    WHEREj.EC_STEP_ID IS NULL

    Third, you are using the source adapter to JOIN your source table (the temp table) with the destination table. This defeats the purpose of the data flow task in SSIS. You should get ALL rows from your temp table and use data flow transformations to determine which rows you need to insert into your destination. A lookup transformation task would do this quite nicely (given you have proper indexing on your destination table). So you read all rows in from your source, applying any source available filters to the data, and you use the data flow tasks to eliminate rows that shouldn't end up at the destination.

    This removes much of the burden of the processing from the database engine and allows the SSIS service to do the work.

    Fourth, why do you need 2 packages to do this? If package #2 is only responsible for moving the data between your temp table and the final destination table and both are hosted by the same DB, why not just use a stored procedure? Using temp tables for staging data is often times an unnecessary step when working with SSIS.

    I would recommend that you get rid of package #2. You should have package #1 get the rows from the source, perform any required transformations to the data, and load the data directly into your destination table. This is what SSIS is designed for....ETL (Extract - Transform - Load). At the very least, replace package #2 with a stored procedure....but I would incorparate it all into package #1.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • Thank you so much for the explanation

    Actually i am new to SSIS packages, someone suggested that having a temp table as stataging would let us have the back data or check the data before loading if there any discrepencies , but i dont really completely agree to this as we are deleting the temp table data everytime the pacakge runs...

    So i want to try as you said and try if my package runs efficiently

    Fisrt: Yes i have created Index

    Second: I removed the Not in Query from My procedure and have syntax changed to below ...This is the SP i run for getting the Data from Temp Table to Main Table

    ALTER PROCEDURE [dbo].[Proc_DiffData_EC_JOB]

    AS

    BEGIN

    SET NOCOUNT ON;

    Select * from dbo.DW_T_EC_JOB_TEMP t

    where

    t.EC_STEP_ID not in

    (select j.EC_STEP_ID from DW_T_EC_JOB j

    INNER JOIN DW_T_EC_JOB_TEMP t

    ON

    j.EC_STEP_ID = t.EC_STEP_ID)

    END

    Third:

    Could you please help mw creating the Look up Transformation ..

    The Basic stucture of my package what i had is

    First Packages gets 7 days worth of data from another database and loades into Temp Table in My database,

    Second Package looks for the new EC_STEP_ID (BASICALLY NEW RECORDS) in the Temp Table which are not there in Main table and loads them to the main table....

    Hope This solves My problem once in for all, i spent almost 2 days just to trouble shoot this :-(..(I am new to my Job)

    Please Help....

    John Rowan (3/19/2010)


    I see room for improvement here in several places.

    First, you've got to use constraints and indexes in your table design. Even if you get this working, you'll run into problems down the road as your tables grow. Proper database design is paramount in database performance. If you ignore it now, you'll pay later. You need Primary Keys and clustered indexes on your tables!

    Second, for your source adapter query, the query you have is terribly inefficient. From the looks of it, you are basically trying to select rows from your temp table that do not exist in your destination table. A simple OUTER join would server you better than your NOT IN attempts. Using NOT IN like you've done means that SQL Server must reference the temp table twice in the same query. This results in way more reads than necessary. Here's a better replacement:

    SELECT t.*

    FROMdbo.DW_T_EC_JOB_TEMP t

    LEFT JOIN dbo.DW_T_EC_JOB jON j.EC_STEP_ID = t.EC_STEP_ID

    WHEREj.EC_STEP_ID IS NULL

    Third, you are using the source adapter to JOIN your source table (the temp table) with the destination table. This defeats the purpose of the data flow task in SSIS. You should get ALL rows from your temp table and use data flow transformations to determine which rows you need to insert into your destination. A lookup transformation task would do this quite nicely (given you have proper indexing on your destination table). So you read all rows in from your source, applying any source available filters to the data, and you use the data flow tasks to eliminate rows that shouldn't end up at the destination.

    This removes much of the burden of the processing from the database engine and allows the SSIS service to do the work.

    Fourth, why do you need 2 packages to do this? If package #2 is only responsible for moving the data between your temp table and the final destination table and both are hosted by the same DB, why not just use a stored procedure? Using temp tables for staging data is often times an unnecessary step when working with SSIS.

    I would recommend that you get rid of package #2. You should have package #1 get the rows from the source, perform any required transformations to the data, and load the data directly into your destination table. This is what SSIS is designed for....ETL (Extract - Transform - Load). At the very least, replace package #2 with a stored procedure....but I would incorparate it all into package #1.

  • First, your stored procedure will still be inefficient. Replace your query with the query that I posted in my last post. You are still using NOT IN....change that to the LEFT JOIN version that I supplied. But the fact remains, you really don't need the SP if you are planning on incorporating the work into your first package.

    So here's how you set your first package up to do all of the work. Most of the work that you'll do to get your data from the source to destination will happen inside a data flow task. I imagine that you already have a data flow task that handles moving the data into your staging table.

    So here's how your data flow should look:

    1. Source adapter that gets the data from your source.

    2. Next, include any other data validations that you need to do.

    3. Use a lookup transformation. Insert the lookup transformation into the data flow pipeline. Connect it to the tasks from above. Now, inside the lookup transformation, set the connection manager to point to your destination DB and table. On the columns tab, drag the STEP_ID from the source input and drop it on the STEP_ID column in the Available Lookup Columns table. This will tell SSIS that it needs to compare the STEP_IDs from your source to the STEP_IDs in the destination table. Now, go to the Configure Error Output screen (the big button at the bottom). There, set the Error column value to 'Redirect Row'.

    The way the lookup transformation behaves, it considers it an error when the lookup fails. Since your looking for rows that do not exist and the lookup considers these as errors, you need to set the Error column to 'Redirect Row'. Press OK and close the Lookup Transformation Editor.

    4. Destination Adapter. Set the destination to your destination table.

    Now, since we've told the Lookup Transformation to redirect the error rows (which aren't really errors, they're just rows that don't exist in the lookup table, which in this case is the destination table) you can now take the red arrow and send it into your destination. The lookup rows that existed in the destination were sent to the green arrow. Since we're only using the red arrow, rows that already exist in the destination are simply ignored since we've not connected the green arrow anywhere. In a sense, we've just configured the Lookup Transformation to behave just like an OUTER JOIN.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • could you please let me know why we have used where ec_step_id is Null?

    As there are no nulls in ec_step_id.....

  • rds207 (3/19/2010)


    could you please let me know why we have used where ec_step_id is Null?

    As there are no nulls in ec_step_id.....

    It's how you write an outer join when you want to look for rows that do not have a match in the outer table. Let's disect the query and explain the JOIN logic behind it. I think once you understand what's going on under the hood, it will seem quite simple for you.

    Consider this test data:

    --==== Set up test tables

    --====Table 1 is the source table, Table 2 is destination

    DECLARE @Table1 TABLE (STEP_ID int, somevalue varchar(25))

    INSERT INTO @Table1

    SELECT 1, 'row one - source' union all

    SELECT 2, 'row two - source' union all

    SELECT 3, 'row three - source' union all

    SELECT 4, 'row four - source' union all

    SELECT 5, 'row five - source' union all

    SELECT 6, 'row six - source'

    DECLARE @Table2 TABLE (STEP_ID int, somevalue varchar(25))

    INSERT INTO @Table2

    SELECT 1, 'row one - destination' union all

    SELECT 2, 'row two - destination' union all

    SELECT 3, 'row three - destination' union all

    SELECT 4, 'row four - destination'

    Now, the question you are asking is a pretty basic JOIN technique. Let's first look at a simple INNER JOIN result set...

    --==== INNER JOIN - find rows that exist in both tables

    SELECT t1.STEP_IDas [Table1 STEP_ID],

    t1.somevalueas [Table1 Value],

    t2.STEP_IDas [Table2 STEP_ID],

    t2.somevalueas [Table2 Value]

    FROM@Table1 t1

    INNER JOIN @Table2 t2 ON t1.STEP_ID = t2.STEP_ID

    Here's the results it produces. Just as expected, we see the rows that are common between the 2 tables based on STEP_ID.

    Table1 STEP_ID Table1 Value Table2 STEP_ID Table2 Value

    -------------- ------------------------- -------------- -------------------------

    1 row one - source 1 row one - destination

    2 row two - source 2 row two - destination

    3 row three - source 3 row three - destination

    4 row four - source 4 row four - destination

    OK, now let's look at an OUTER JOIN. Here's a query that shows the full result set from a LEFT OUTER JOIN between the 2 tables:

    --==== LEFT OUTER JOIN w/o IS NULL filter to show the

    --==== result set

    SELECT t1.STEP_IDas [Table1 STEP_ID],

    t1.somevalueas [Table1 Value],

    t2.STEP_IDas [Table2 STEP_ID],

    t2.somevalueas [Table2 Value]

    FROM@Table1 t1

    LEFT JOIN @Table2 t2 ON t1.STEP_ID = t2.STEP_ID

    Now keep in mind that all a JOIN does is combine 2 result sets on a given criteria. The INNER JOIN combined the result sets of Table1 and Table2 where there were common STEP_IDs. A LEFT OUTER JOIN will show the combined result set of all rows from the LEFT table, plus all common rows from the OUTER table (in this case table2).

    Here's the results it produces:

    Table1 STEP_ID Table1 Value Table2 STEP_ID Table2 Value

    -------------- ------------------------- -------------- -------------------------

    1 row one - source 1 row one - destination

    2 row two - source 2 row two - destination

    3 row three - source 3 row three - destination

    4 row four - source 4 row four - destination

    5 row five - source NULL NULL

    6 row six - source NULL NULL

    Notice that all of the rows from Table1 are present. All of the matching rows from Table2 show their values, but the Query Processor brings back NULL values for those rows that have no match in the OUTER table.

    Enter the IS NULL filter. Now, with the above result set, we only want those rows that do not exist in the outer table, Table2 so we add a condition in our WHERE clause to give us rows where Table2.STEP_ID IS NULL. This has nothing to do with the actual values in your source data; rather, the condition is applied to the result set created by the LEFT OUTER JOIN. As show above, the LEFT OUTER JOIN between these 2 tables creates NULL values in the non-existant rows. Adding the WHERE Table2.STEP_ID IS NULL filter to the WHERE clause tells the Query Processor that you only want rows that DO NOT have a match in the outer table.

    Here's our 'almost' final query:

    --==== LEFT OUTER JOIN w IS NULL filter to show the

    --==== full result set with NULL condition applied

    SELECT t1.STEP_IDas [Table1 STEP_ID],

    t1.somevalueas [Table1 Value],

    t2.STEP_IDas [Table2 STEP_ID],

    t2.somevalueas [Table2 Value]

    FROM@Table1 t1

    LEFT JOIN @Table2 t2 ON t1.STEP_ID = t2.STEP_ID

    WHEREt2.STEP_ID IS NULL

    You'll notice that adding the IS NULL check limits the result set to only show those rows that exist in Table1 but not in Table2.

    Table1 STEP_ID Table1 Value Table2 STEP_ID Table2 Value

    -------------- ------------------------- -------------- -------------------------

    5 row five - source NULL NULL

    6 row six - source NULL NULL

    I show you the entire result set (both Table1 and Table2 values) so that you can see how the Query Processor sees it--as 2 tables, joined together.

    Now, to finalize the query, you limit what you bring back in the SELECT by only looking at the non-null Table1 values:

    --==== Now, only view Table1 data

    SELECT t1.STEP_IDas [Table1 STEP_ID],

    t1.somevalueas [Table1 Value]

    FROM@Table1 t1

    LEFT JOIN @Table2 t2 ON t1.STEP_ID = t2.STEP_ID

    WHEREt2.STEP_ID IS NULL

    Here, with the new result set, you see that this method only brings back Table1 values that do not have a match in Table2 on STEP_ID.

    Table1 STEP_ID Table1 Value

    -------------- -------------------------

    5 row five - source

    6 row six - source

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • One more thing to keep in mind.....if you plan to combine the 2 packages into one, you no longer need the Stored Procedure so everything I stated above was really just to answer your question about the OUTER JOIN.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • wow!!! Thats Amazing ...i replaced the syntax it worked !!!Initailly it got stuck at the same point i waited for while and amazingly it ran and completed the Job ...

    Thank you very very much...I really apreciate all your effort in explaining me the stuff..

  • Glad to help.

    So now that you've fixed the performance problem with your SP, I think you still have some work to do. Go back to my first post and re-read it. If your source adapter is using this code as it's source, it's still defeating the purpos of doing this in a data flow.

    I still recommend that you proceed with either:

    1. Eliminate pacakge #2 and use a stored procedure to move the data between your temp table and final table.

    2. Eliminte package #2 and use the lookup transformation in package #1 to bypass the temp table and move the rows directly into the final table.

    3. Use the lookup tranformation in package #2 to do the compare for you instead of using the SP. Configure the source adapter to get all rows from the temp table (instead of using the SP logic). and use the lookup to determine which rows need to get to the final table.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • ok, so now am trying with Lookup transformation , the stored procedure i tried for the other package is running fine , the source for that package was from another database , i actually joined 3 tables and transfered the data to temp table and ran the stored procudure what you said and got the new rows to main table....

    Now i want to try with Look up transformation for another package as these package is getting the data from same database....

    I will explain what i did please let me know if iam doing it right

    1.Initially i created a package and for Source OLEDB gave the syntax below:

    SELECT

    T.EC_JOB_IND,

    T.JOB_NAME,

    T.PROCEDURE_NAME,

    T.PROJECT_NAME ,

    T.JOB_ID ,

    T.JOB_STEP_NAME,

    T.CREATED_TIME ,

    T.START_TIME ,

    T.FINISH_TIME ,

    T.RUNNABLE_TIME,

    T.OWNER ,

    T.HOST_NAME,

    T.RESOURCE_NAME,

    T.OUTCOME ,

    T.STATUS ,

    T.STEP_INDEX,

    T.START_MILLIS,

    T.FINISH_MILLIS,

    T.RUNNABLE_MILLIS,

    T.EC_STEP_ID ,

    T.PROJECT_ID ,

    T.LAUNCHED_BY_USER ,

    T.PARENT_ID,

    T.JOB_PRIORITY,

    JOB_TYPE = CASE WHEN T.PROJECT_NAME LIKE 'QCOM.QCT.MODEM%' THEN 'HCI-MODEM'

    WHEN T.PROJECT_NAME LIKE 'MODEM%' THEN 'HCI-MODEM'

    WHEN T.PROJECT_NAME LIKE 'QCOM.QCT.SQM%'THEN 'HCI-SQM'

    WHEN T.PROJECT_NAME LIKE 'AUDIO%' THEN 'HCI-AUDIO'

    WHEN T.PROJECT_NAME LIKE 'QCOM.QCT.MULTIMEDIA%'THEN 'HCI-MULTIMEDIA'

    WHEN T.PROJECT_NAME LIKE 'QCOM.QCT.WCONNECT%'THEN 'HCI-WCONNECT'

    WHEN T.PROJECT_NAME LIKE 'BREW%'THEN 'BREW'

    END

    FROM DW_T_EC_JOB t

    LEFT JOIN DW_T_EC_HCI_BREW j ON j.EC_STEP_ID = t.EC_STEP_ID

    WHERE j.EC_STEP_ID IS NULL AND T.EC_JOB_IND IN ('HCI','BREW')

    2.The i added Lookup transformation ,

    and on general tab, select redirect rows to error output

    on connection Tab i gave selected table DW_T_EC_HCI_BREW (My destination)

    on columns Tab , i select ec_step_id and mapped it to available look up columns

    3.Added Destination OLEDB and connected red arrow to destination table DW_T_EC_HCI_BREW

    Saved it and Ran... its is in debugging mode....for past 15min , i think it got hanged at lookerroroutput rows 2248...

    Did i do anything wrong?Attached is the screen shot of DataFlow...

Viewing 12 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply