Performance - SSIS Package vs .NET Application

  • Hi,

    I was working on revamping one of my packages. The package basically connects to more than 1000 SQL Servers to perform certain operations based on a configuration file. Performance was not exactly an issue but I wanted to improve it in the new version. I wend asking for suggestions and tried many ways to get significant improvements. That was when a colleague suggested creating a .NET application instead of a package. I ruled out the possibility instantly due to the initial work involved and challenges in configuring the application.

    But after trying out all the alternatives, I went ahead and wrote a simple application for testing purpose. It wasn't configurable and performed just one action on all the servers. To my surprise, I saw an improvement beyond 50%. The job which my package took 30 minutes to complete was completed in just 7 minutes.

    Now when I weigh the pros and cons, a .NET application does sound as a feasible option if you're ready to work out the initial effort required. Here's how I weighed it:

    Development

    SSIS -Lesser time, easily configurable, readily available architecture

    App- Longer time, requires building an entire framework for making it configurable

    Winner - SSIS

    Deployment Environment

    SSIS - Require SQL Sever for basic functions. For advance functions, also require SSIS installed

    App - Require .NET framework

    Winner - App

    Performance

    SSIS - Slower

    App - Faster if done correctly

    Winner - App

    It could be possible that my requirement is one of its kind or my package is not optimized. But from what I saw in terms of performance, is it time to build an application architecture for .net applications?

    Please share your thoughts...

    Cheers,

    Sid

  • My first thought was that the discussion about a change in methodology might be premature. Have you looked at exactly where the package is taking longer than your test .net app? I could see a bit of overhead but the numbers you are talking are well outside the range I would expect. Also are the .net app and the package really doing the same stuff? I've seen a post before where the .net app didn't really do everything the package did and it ran a lot faster. When the code was changed to do everything it then ran a lot slower, mileage may vary. I guess a lot depends on what you are doing. If you are copying data from one server to another then .net will rarely be as fast or faster, SSIS is optimized to do that and I've seen amazing amounts of data moved on really underpowered hardware. So you might expound on what you are actually doing..

    CEWII

  • Thanks for your reply Elliott.

    The package is basically performing the task of data comparison / data synchronization. For each run, it does so for approximately 1200 servers. It basically compares the data of a base server to a child server and copies over the missing rows. We had to use this option as it is required to run on ad-hoc basis and replication is not an option. Based on the parameters passed, it decides the table(s) required for comparison. Then it compares the primary keys of tables in both the servers and copies the data which is not present in child server. Apart from performing the actual task, it also performs logging so at the end of day, you can get a report on which servers were synchronized successfully and which failed.

    I did notice that the data transfer was quite fast. But the time it consumed when switching between servers was the one which created a huge difference. The package takes anywhere between 1.5-4 seconds to switch server. When you consider the server count, it amounts to a lot of time.

    Thanks,

    Sid

  • Are you running the app on the same server as the SSIS package runs? Also, what is your comparison methodology? Another thought is you probably have a connection manager to the local database that never changes, you might consider setting the "keep connection" box for it so that logout/login cycles are minimized.

    CEWII

  • Yes Elliott, both of them are running on the same server. As the local server is a production machine, I'm not querying the data for each comparison. I fetch the data from the server at one shot and store it in raw files, which act as buffer.

    For comparing the data, I do a left outer join between the base server and child server. Then I use a conditional split to get the output where the primary key of child data is null. This output then goes into the destination, which is the child server.

    Thanks,

    Sid

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply