SSIS Question

  • Hi everyone.  I am re-writing a portion of my SSIS logic.  My SSIS package downloads daily stock market data and then performs calculations for each trading day.  Once a particular day's calculations have been done and stored there is no need to re-do this work when new data is added.  Only new data should be processed.  Version 1 does a full re-calculation of historical data plus new data.  This is not efficient.  Version 2 only does a calculation of new data and does not touch old data.

    Version 1

    Use Execute SQL Task Editor to run SP1

    Version 2

    Use Script Task Editor to run RunStoredProcedures.  RunStoredProcedures is a collection of multiple SPs so it includes SP1.  SP1 was not modified.

    RunStoredProcedures:

    RunStoredProcedures
    /****** Object: StoredProcedure [dbo].[RunStoredProcedures] Script Date: 2024-10-03 1:52:18 PM ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO

    ALTER procedure [dbo].[RunStoredProcedures]
    as

    EXEC dbo.SP1
    EXEC dbo.SP2

    --truncate table dbo.DataFileProcessingDates

    --debugging purposes to test the rollback logic..remove after testing is done
    select 1/0

    Version 2 is producing below lines in SSIS output which I do not see in Version 1.  Why are they appearing?  How can I fix Version 2 so they no longer appear?  SP1 was not modified when Version 1 and Version 2 were run so the issue isn't related to SP1.  It is something to do with how the SP was run (my educated guess).

    Information: 0x0 at Run Stored Procedures: Message: Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
    Operation cancelled by user.
    Warning: Null value is eliminated by an aggregate or other SET operation.
    Warning: Null value is eliminated by an aggregate or other SET operation.
    Warning: Null value is eliminated by an aggregate or other SET operation.
    The statement has been terminated.
    Information: 0x0 at Run Stored Procedures: Rollback of RunStoredProcedures successful

    I am a rookie coder so I am quite sure my C# script below is not optimal.  If there are better ways to do the same thing please let me know.

    C# Script:

    public void Main()
    {
    // TODO: Add your code here

    try
    {
    string connectionString = @"Data Source=localhost;Initial Catalog=TestDB;Integrated Security=True";
    SqlConnection con = new SqlConnection(connectionString);
    con.Open();

    SqlCommand cmd = con.CreateCommand();
    SqlTransaction transaction;

    transaction = con.BeginTransaction();

    cmd.Connection = con;
    cmd.Transaction = transaction;

    try
    {
    string SQL = "dbo.RunStoredProcedures";

    cmd = new SqlCommand(SQL, con, transaction);

    cmd.CommandType = CommandType.StoredProcedure;

    cmd.ExecuteNonQuery();

    transaction.Commit();

    }
    catch (Exception ex1)
    {
    bool fireAgain = false;
    Dts.Events.FireInformation(0, null,
    string.Format("Message: {0}", ex1.Message),
    null, 0, ref fireAgain);

    try
    {
    transaction.Rollback();
    fireAgain = false;
    Dts.Events.FireInformation(0, null,
    string.Format("Rollback of RunStoredProcedures successful"),
    null, 0, ref fireAgain);
    }
    catch (Exception ex2)
    {
    fireAgain = false;
    Dts.Events.FireInformation(0, null,
    string.Format("Rollback Exception Type: {0}", ex2.GetType()),
    null, 0, ref fireAgain);

    }

    }
    //transaction.Commit();
    con.Close();
    Dts.TaskResult = (int)ScriptResults.Success;

    }
    catch (Exception ex3)
    {
    Dts.Events.FireError(0, "Exception from Script Task", ex3.Message + "\r" + ex3.StackTrace, String.Empty, 0);
    Dts.TaskResult = (int)ScriptResults.Failure;

    }
    }

    Thank you

     

     

  • google is your friend - you should have done it first.

     

    https://learn.microsoft.com/en-us/sql/t-sql/statements/set-ansi-warnings-transact-sql?view=sql-server-ver16

    regarding the code and because you got a timeout  I would add a command timeout before the executenonquery.

    cmd.CommandType = CommandType.StoredProcedure;
    cmd.CommandTimeout = 120; // this is in seconds - adjust as needed. use zero to never timeout
    cmd.ExecuteNonQuery();
  • frederico_fonseca wrote:

    google is your friend - you should have done it first.

    https://learn.microsoft.com/en-us/sql/t-sql/statements/set-ansi-warnings-transact-sql?view=sql-server-ver16

    regarding the code and because you got a timeout  I would add a command timeout before the executenonquery.

    cmd.CommandType = CommandType.StoredProcedure;
    cmd.CommandTimeout = 120; // this is in seconds - adjust as needed. use zero to never timeout
    cmd.ExecuteNonQuery();

    Thanks for the reply.  it works

    • This reply was modified 3 weeks ago by  water490.
  • First of all, the warning was telling you that you have columns that you're doing aggregates like SUM() and COUNT() on contain some NULL values.  Are you sure those columns are supposed to be  NULL?  If not, then you need to turn warnings back on and go figure out what the problem is with the missing data.

    The other thing is... you're doing this all in SSIS.... why are you using C# to execute your stored procedures instead of using SSIS to call them instead of just incorporating them into a process flow (I forget the real name for them in SSIS) ???

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • deleted post

    • This reply was modified 3 weeks ago by  water490.
  • Correct Jeff, execute SQL Task is your friend here.

    Screenshot 2024-10-06 065511

  • water490 wrote:

    Jeff Moden wrote:

    First of all, the warning was telling you that you have columns that you're doing aggregates like SUM() and COUNT() on contain some NULL values.  Are you sure those columns are supposed to be  NULL?  If not, then you need to turn warnings back on and go figure out what the problem is with the missing data.

    The other thing is... you're doing this all in SSIS.... why are you using C# to execute your stored procedures instead of using SSIS to call them instead of just incorporating them into a process flow (I forget the real name for them in SSIS) ???

    there are a bunch of SPs...some of them are doing prep work for another SP.  The warnings are known issues that have been addressed in the SP that consume the output of the prep SPs.

    My SSIS package does alot of other data prep so it makes sense to me to use it.

    One of the advantages to SSIS is the ability to set precedence constraints - so you can control which procedures are executed and what order they are executed.

    Along with this - you can also set them up to run parallel.  For example, if you have 3 procedures that can all run at the same time but all 3 need to be completed successfully before the 4th procedure is executed - you can create the constraints that way.

    Instead of running a single procedure - or a single task - I would break out each procedure and set them up as individual Execute SQL Tasks, with appropriate precedence constraints.  If any of the procedures can be run in parallel this could reduce overall execution time significantly.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams wrote:

    water490 wrote:

    Jeff Moden wrote:

    First of all, the warning was telling you that you have columns that you're doing aggregates like SUM() and COUNT() on contain some NULL values.  Are you sure those columns are supposed to be  NULL?  If not, then you need to turn warnings back on and go figure out what the problem is with the missing data.

    The other thing is... you're doing this all in SSIS.... why are you using C# to execute your stored procedures instead of using SSIS to call them instead of just incorporating them into a process flow (I forget the real name for them in SSIS) ???

    there are a bunch of SPs...some of them are doing prep work for another SP.  The warnings are known issues that have been addressed in the SP that consume the output of the prep SPs.

    My SSIS package does alot of other data prep so it makes sense to me to use it.

    One of the advantages to SSIS is the ability to set precedence constraints - so you can control which procedures are executed and what order they are executed.

    Along with this - you can also set them up to run parallel.  For example, if you have 3 procedures that can all run at the same time but all 3 need to be completed successfully before the 4th procedure is executed - you can create the constraints that way.

    Instead of running a single procedure - or a single task - I would break out each procedure and set them up as individual Execute SQL Tasks, with appropriate precedence constraints.  If any of the procedures can be run in parallel this could reduce overall execution time significantly.

    This sounds like a good idea.  Are you referring to this?

    https://learn.microsoft.com/en-us/sql/integration-services/control-flow/precedence-constraints?view=sql-server-ver16

    I have a few questions:

    1.  SP1 and SP2 both read data from Table A.  Can I have SP1 and SP2 run at the same time?  I am wondering does SP1 creates a lock on Table A b/c it needs to read from it?
    2. The reason why I need to run all the SPs in a new SP is so I can have a rollback happen if any of the SPs experience a failure.  Can I do this with your suggestion?  The reason why I need a rollback is....I have created a table that tracks the dates of the new data.  All of the SPs join on this table so that way only new data is considered for calculation.  Once the SPs are run without any errors then the table is truncated.  If the SPs fail then I have to make sure the table tracking new dates isn't truncated.  I will need those dates to re-run the SPs after the reason for the failure is investigated/resolved.

     

  • Yes - you definitely need to be using precedence constraints between tasks.

    1. Reads (select) don't block other reads - locks on a select could prevent inserts/updates/deletes on the table(s) but won't block selecting data.
    2. I would not worry about rolling back a table that tracks the dates.  Based on your description, I would put the truncate at the beginning of the process instead of the end of the process.  If any step fails, you would have all the rows in that table available for troubleshooting.  It would be cleared when the process is restarted - or if you need that data to 'skip' sections, you can have a restart parameter that is checked before truncating - then use the existing data.

    There are many ways to consider how to process - but running each procedure one at a time is going to be the slowest.

     

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • deleted post

    • This reply was modified 3 weeks ago by  water490.
  • Jeffrey Williams wrote:

    Yes - you definitely need to be using precedence constraints between tasks.

    1. Reads (select) don't block other reads - locks on a select could prevent inserts/updates/deletes on the table(s) but won't block selecting data.
    2. I would not worry about rolling back a table that tracks the dates.  Based on your description, I would put the truncate at the beginning of the process instead of the end of the process.  If any step fails, you would have all the rows in that table available for troubleshooting.  It would be cleared when the process is restarted - or if you need that data to 'skip' sections, you can have a restart parameter that is checked before truncating - then use the existing data.

    There are many ways to consider how to process - but running each procedure one at a time is going to be the slowest.

    Thank you for this.  I will take a look.

  • deleted post

    • This reply was modified 3 weeks ago by  water490.
  • deleted post

    • This reply was modified 3 weeks ago by  water490.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply