SSIS - New Implementation

  • Hello guys,

    We are trying to implement SSIS as an ETL tool (new to our environment). Is there any document / best practices I should follow to set up SSIS.

    Any help/guidance on this is much appreciated.

    Many thanks!

  • Define 'set up' and 'best practices'.

    If you're saying that your company is going to start using SSIS and would like to know if there's any kind of guidance on how to do ETL then I'm going to guess the answer is 'not really'. Every company has their own processes and even within the same one there are differences depending on the area writing the code.

    Some things that I personally try to jam through when I'm building new processes/packages.

    Naming standards. Whatever works for your company but my preference is 'Grouping_Process' as in Contact_Extract. That way all the packages for a process hang together.

    Parameters. I'm a big fan of parameters. Create a parameter table and have a package that sets them for the incremental load. Then create a stored procedure to overwrite them. I can go into more detail how I handle that one but the result is the ability to gap fill and rerun by setting the parameter and requesting the job be run and no code is touched.

    Logging. Find a way to log what's important to you. If you're using the catalog there will be reports but they're not very open for viewing. I built my own process since we want to catch things that aren't always part of the things SSIS or SQL want to log.

    Embedded code vs stored procedures. Putting your SQL into stored procedures means using SSIS more as a control flow than the tool but it also makes your code accessible and changeable without having to open the package. The downside is that you've got a bunch of stored procedures that need to go with the package along with the rights to execute them.

    If you're interested in more detail about how I built out the current framework and templates for our ETL I'll gladly write a lot more and give examples.

  • Thanks a lot Marie. That was so informative.

    How should I handle when we deploy the packages from dev to another environment. Any best practices in that area. I have heard of ssisdb. Do I need to worry about it? Or can i use import to msdb folder and schedule using sql agent.

    It would be great if you could provide some details on template to set up for new environments.

  • There's 2 deployment models starting with version 2012.

    Package - Every package is a separate entity

    Project - The entire project is deployed to the SSIS Catalog

    They both have advantages and disadvantages. I recommend you do your own research on these and see how they fit into your environment.

    SSIS packages can be stored in msdb or they can be stored on a share/drive. Either way they can be run as SQL jobs or as command line jobs.

    Ah. My template. Get yourself a beverage of choice and get comfortable. 😎 I'm only going to hit the high points here.

    Stored procedures:

    Audit Log Insert

    Parameter Insert

    Parameter Manual Insert

    Dynamic Table Truncate

    Packages:

    Template_Parameter

    Template_Extract

    Template_Transform (as needed)

    Template_Load

    Template_Notify

    Processes:

    Audit Log - This logs every step of the package process, the number of records copied, critical events, and anything else we feel worth of logging. It can include a LOT of audit events.

    Set the parameters using the destination data to find out your extract increment. Decide how you want to handle the end date/time. If you're not cutting off at midnight I recommend subtracting a second or so from the current time to take into account records that are in the process of being written. Save those into your parameter table.

    Gather up the parameters in the Extract package, truncate the staging table, and copy the data over. I saved myself a lot of stored procedures by making my truncate a stored procedure using dynamic SQL. I can get away with this because the table name is a package user variable.

    If necessary do any major transformations in their own package. Sure they could be done in the extract but for clarity it's easier to see what's being done if there's a package that has just that part. I like bite size steps.

    Load the data from staging to production. This package does minimal logging and mostly executes stored procedures since T-SQL is a lot more flexible and faster for this kind of work in my opinion.

    The final step is the notification if anything kicked up a critical event flag. I've got event handlers all over but these can also be business rules that might have an impact. Examples are no records copied, too many/few records, data testing, etc. Anything you would consider critical. If there are it sends a nice HTML formatted message with the audit log entries.

    It's a lot to make in the template. But once it's done using it is really simple. An extract requires the table name variable and a data flow. Other than that there's nothing to code unless there's something odd.

    I use the parameter table for more than just the ETL. The values for sending the email are in there and some of the maintenance processes have entries. It's a handy thing to have at your disposal.

    If you've never used SSIS before this is a lot to take in. Using the wizard and adding on a few tasks is simple but you're going to want more as you mature the processes.

  • It seems you're just getting started as a DBA administering SSIS and already looking for general advice on how best to approach it. So, that's good. At this point maybe you can start some good books, one as an introduction and one on design patterns.

    Knight's Microsoft SQL Server 2012 Integration Services 24-Hour Trainer

    http://www.amazon.com/Knights-Microsoft-Integration-Services-24-Hour/dp/1118479580

    SQL Server Integration Services Design Patterns

    http://www.amazon.com/Server-Integration-Services-Design-Patterns/dp/1484200837/

    David Peter Hansen does some great conference presentations on SSIS performance, configuration, and operations from the perspective of a DBA.

    https://www.youtube.com/channel/UCIjPbfMxM8SWdms2kaofudw

    I myself am currently in the process of designing a large scale SSIS server to consolidate and standardize what our organization is currently using (about 30 legacy SSIS instances scattered about ranging from 2005 - 2012). Below is a dump of some of some MSDN and other source material related to configuring the network, hardware, and disk system to optimize the runtime environment of SSIS.

    http://thinknook.com/8-ways-to-optimize-your-ssis-package-2013-06-28/

    http://blogs.msdn.com/b/sqlcat/archive/2011/09/27/maximizing-sql-server-throughput-with-rss-tuning.aspx

    http://blogs.msdn.com/b/sqlcat/archive/2013/09/16/top-10-sql-server-integration-services-best-practices.aspx

    http://blogs.msdn.com/b/sqlperf/archive/2007/05/01/set-blobtempstoragepath-and-buffertempstoragepath-to-fast-drives.aspx

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Thanks a lot!

  • Thanks a lot!

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply