ETL Performance Auditing - Part 1: Introduction to ETL Auditing

  • Comments posted to this topic are about the item ETL Performance Auditing - Part 1: Introduction to ETL Auditing

    Frank Banin
    BI and Advanced Analytics Professional.

  • From the Article


    The problem is there are over 300 individual child SSIS packages that ...

    ...{snip}...

    For bigger ETL setups with many packages and tasks, you might need a system that loops through all your packages to obtain this information. For instance, you can loop through the .dtsx files with XML, or use C# or VB and the SSIS object model to achieve this objective, a topic I will leave for another discussion.

    Good article, but I'm really hoping that "another discussion" will be one of the two other parts of this series, because something that does the above is going to be important to someone who has "over 300 individual child SSIS packages".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You say, "By the end of this Part you should have enabled logging using the SQL Server log Provider, and also gathered the Names and IDs of your packages and tasks."

    This first article does not show how to set up the suggested logging for the Pre, Post, etc. events, nor does it explain how to use the SQL Server log Provider. It also does not clearly explain where the Package ID tables live or how to find them. I suspect you are assuming that the packages are stored in a database rather than on the file system, but I cannot tell from the article.

    Perhaps I missed these descriptions in your article. Also, did you say which version of SSIS you are using as the basis for it?

    Thanks,

    Kevin

  • To Jeff Moden's point, I will discuss how to obtain package information for cases where people have a lot of packages. I wanted to discuss my approach separately because it uses the SSIS object model and C#, which themselves require some explanation, and I did not want to take the focus away from what I am trying to achieve with this series.

    Kevin, to your concern:

    In the article "Serving Warm SSIS Errors" on SQL Server Central, see the “Enabling SSIS logging with SQL Server log Provider” section for details on how to set up the suggested logging mode. Once logging is enabled, it does not matter where you deploy your packages. As for the Package ID table, as you will see later, all you need is to be able to join it to the SSIS logging table that your logging writes to.
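
    As a rough illustration of that join, here is a minimal Python sketch using an in-memory SQLite database. It is not the method from the series. The `ssis_log` table is a hypothetical stand-in that copies a few column names from the SSIS logging table (`event`, `source`, `sourceid`, `executionid`, `starttime`), `package_objects` is an assumed lookup table of package and task IDs, and all rows and GUIDs are made up.

```python
import sqlite3

# Hypothetical stand-ins: ssis_log mimics a few columns of the SSIS logging
# table; package_objects is an assumed lookup of package/task names by ID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ssis_log (
        event TEXT, source TEXT, sourceid TEXT, executionid TEXT, starttime TEXT
    );
    CREATE TABLE package_objects (
        source_id TEXT, package_name TEXT, object_name TEXT
    );
""")

# Made-up IDs for one package, one task, and one execution.
PKG = "{11111111-1111-1111-1111-111111111111}"
TASK = "{22222222-2222-2222-2222-222222222222}"
RUN = "{AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA}"

conn.executemany("INSERT INTO ssis_log VALUES (?, ?, ?, ?, ?)", [
    ("OnPreExecute", "LoadCustomers", PKG, RUN, "2013-01-01 02:00:00"),
    ("OnPreExecute", "Truncate Staging", TASK, RUN, "2013-01-01 02:00:00"),
    ("OnPostExecute", "Truncate Staging", TASK, RUN, "2013-01-01 02:00:45"),
    ("OnPostExecute", "LoadCustomers", PKG, RUN, "2013-01-01 02:01:30"),
])
conn.executemany("INSERT INTO package_objects VALUES (?, ?, ?)", [
    (PKG, "LoadCustomers", "LoadCustomers"),
    (TASK, "LoadCustomers", "Truncate Staging"),
])

# Pair each OnPreExecute row with the OnPostExecute row for the same object
# and execution, then join to the lookup table on the ID.
# A task that runs several times in one execution (for example inside a
# loop) would need extra pairing logic that this sketch does not attempt.
DURATIONS_SQL = """
    SELECT p.package_name, p.object_name,
           CAST(strftime('%s', post.starttime) AS INTEGER)
             - CAST(strftime('%s', pre.starttime) AS INTEGER) AS seconds
    FROM ssis_log AS pre
    JOIN ssis_log AS post
      ON post.sourceid = pre.sourceid
     AND post.executionid = pre.executionid
     AND post.event = 'OnPostExecute'
    JOIN package_objects AS p
      ON p.source_id = pre.sourceid
    WHERE pre.event = 'OnPreExecute'
    ORDER BY p.object_name
"""
durations = conn.execute(DURATIONS_SQL).fetchall()
for row in durations:
    print(row)
```

    The point is only that the ID column is the join key: wherever the packages are deployed, the log rows carry the same IDs as the lookup table.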

    Frank Banin
    BI and Advanced Analytics Professional.

  • A very promising beginning. I have been employing the exact approach you mentioned, with individual audit tasks attached to the tasks in the package, but, as you rightly pointed out, this gets complicated very quickly as the complexity of the packages increases. I look forward to learning how to do this more efficiently.

  • Very interesting indeed! I have high hopes for the next in the series.

    It would be nice to know which version of SSIS this applies to, or whether it is the same in all versions.

    Looking forward to reading the next part.

  • This series comes at a great time for me as I am currently "between cubes" and in more of a refinement stage (ok, Cleanup, but also making tweaks as needed). We have some auditing in place, but only for our custom components while some of what we need to know is how long other components are taking.

    Can't wait!!

  • I liked your article. I found something similar on the Pragmatic Works website and am posting it here as additional material 🙂 The link is http://pragmaticworks.com/Products/Business-Intelligence/BIxPress/ssis-logging-auditing-monitoring.aspx

    The article covers the two most commonly implemented methods of logging. It then gives a preview of a tool (not free) that helps with logging multiple packages and different attributes of the packages.

    I am posting here just FYI.

  • I think one important thing to note is that you would typically set up a template package with all of your variables and package configs in place. By copying and pasting this package you can quickly build up your ETL.

    However, when using this method in conjunction with the SourceID attribute as mentioned, you will need to explicitly generate a new SourceID for each package you copy and paste from the template. Otherwise, all packages will have the same SourceIDs.
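
    One way to catch this before it pollutes the audit data is to check for IDs that appear in more than one package. The following is a minimal Python sketch under assumed inputs: the rows are hypothetical (file name, object name, ID) tuples that you would have gathered yourself, and the file names and GUIDs are made up.

```python
from collections import defaultdict


def find_shared_ids(rows):
    """Return {source_id: [package files]} for IDs used by more than one file.

    rows is an iterable of (package_file, object_name, source_id) tuples.
    IDs are compared case-insensitively.
    """
    files_by_id = defaultdict(set)
    for package_file, _object_name, source_id in rows:
        files_by_id[source_id.upper()].add(package_file)
    return {
        source_id: sorted(files)
        for source_id, files in files_by_id.items()
        if len(files) > 1
    }


# Made-up inventory: LoadOrders.dtsx was copied from the template without
# generating a new ID, so it still shares the template's package ID.
inventory = [
    ("Template.dtsx", "Template", "{11111111-1111-1111-1111-111111111111}"),
    ("LoadOrders.dtsx", "LoadOrders", "{11111111-1111-1111-1111-111111111111}"),
    ("LoadCustomers.dtsx", "LoadCustomers", "{33333333-3333-3333-3333-333333333333}"),
]
shared = find_shared_ids(inventory)
for source_id, files in shared.items():
    print(source_id, "is shared by", ", ".join(files))
```

    Any ID reported here would make log rows from different packages indistinguishable when joined on SourceID.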

  • I believe using the pre- and post-execute event handlers to call SPs that write to an auditing table is the most efficient and least invasive method of auditing, so I would be interested to see if this is covered in subsequent parts.

  • Hi Frank,

    Have you posted an article on how to loop through packages and extract package and task GUIDs?

    I kind of built my own to search for text within the package and pull out values for the DTS:DTSID and the components' refId and componentClassId.

    I'm sure there's an easier way: looping over the files, loading each one into an object variable, and using the object model in a script task to extract and write out the information. I'm going to make an attempt at this, and thought I'd ask whether you already have a code block to share. 🙂
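
    For the XML route, a minimal Python sketch might look like the following. It is not the SSIS object model approach, and it makes some assumptions: that names and IDs are stored as DTS:ObjectName and DTS:DTSID, either as attributes on DTS:Executable elements (the newer package format) or as DTS:Property child elements (the older format). The sample package, its names, and its GUIDs are made up, and data flow components (refId, componentClassId) are not covered.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

DTS_NS = "www.microsoft.com/SqlServer/Dts"


def _prop(elem, name):
    """Read a DTS property from an attribute, else from a DTS:Property child."""
    value = elem.get(f"{{{DTS_NS}}}{name}")
    if value is not None:
        return value
    for child in elem.findall(f"{{{DTS_NS}}}Property"):
        if child.get(f"{{{DTS_NS}}}Name") == name:
            return child.text
    return None


def executables_from_root(root):
    """Return (name, id) for the package and every nested executable."""
    return [
        (_prop(elem, "ObjectName"), _prop(elem, "DTSID"))
        for elem in root.iter(f"{{{DTS_NS}}}Executable")
    ]


def scan_folder(folder):
    """Yield (file name, object name, id) for each .dtsx file in a folder."""
    for path in sorted(Path(folder).glob("*.dtsx")):
        root = ET.parse(path).getroot()
        for name, dtsid in executables_from_root(root):
            yield path.name, name, dtsid


# Made-up, heavily trimmed package used only to exercise the parser.
SAMPLE = """<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts"
    DTS:ObjectName="LoadCustomers"
    DTS:DTSID="{11111111-1111-1111-1111-111111111111}">
  <DTS:Executables>
    <DTS:Executable DTS:ObjectName="Truncate Staging"
        DTS:DTSID="{22222222-2222-2222-2222-222222222222}" />
  </DTS:Executables>
</DTS:Executable>"""

rows = executables_from_root(ET.fromstring(SAMPLE))
for name, dtsid in rows:
    print(name, dtsid)
```

    Calling `scan_folder` on a directory of packages would give one row per package or task, which could then be loaded into a lookup table.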

    Cheers,

    Fahim.

    ** UPDATE **

    For those using SQL 2012 (should be most of us by now), take a look at Reza's posts about the SSISDB Catalog.

    http://www.rad.pasfu.com/index.php?/categories/7-SSIS-Catalog
