Printed 2017/08/23 05:46PM

Better Know A SSIS Transform – The Sort Transform


This is part 2 of my 29 part series called Better Know A SSIS Transform.  Hopefully you will find the series informative.  I will tell you a little about each transform and follow it up with a demo basic you can do on your own.  Screenshots that go with examples can be viewed from my regular blog site.

When developing SSIS packages you are bound find out that some transforms require your data to be sorted before it can accept new input rows (Ex. Merge & Merge Join).  The Sort Transform may be one of the more frequently used transforms for that reason, but it really should be one of the most avoided. 

Why Avoid the Sort Transform?

Hopefully after you read this section you will agree with me that the new title of this blog should be  Better Know a SSIS Transform Better Not Know a SSIS Transform.

The Sort Transform is a fully blocking asynchronous transform, meaning no rows can pass this transform until all rows have been sorted.  If you take a minute to think about it that makes sense.  You cannot send rows further down the Data Flow until all the rows have been sorted, so the Sort Transform holds up the show. 

The best way to visualize this is to think of a deck of cards.  You have a Conditional Split that wants to split the face cards from the numbered cards.  Before that can happen you have a Sort Transform that will sort all the cards by numbers and face value.  None of the cards can reach the Conditional Split until all the cards have been sorted causing our package to stall at the Sort Transform until all the rows have been processed.

So we know now we want to avoid it when possible.  The obvious way to avoid it is by using TSQL in your source, but what if we have a flat file source where you can’t write a TSQL ORDER BY.  In this case we’re stuck using the Sort Transform.

How to Avoid

Strategies for Using

When you have no other option but to use the Sort Transform (Ex. a Flat File Source) then be sure to use it wisely.  Think about the order of your Data Flow.  If you already have an Aggregate Transform in the package that’s performing a Group By then make sure to use the Sort Transform is after the Aggregate Transform.  This way instead of sorting 100,000 rows and then grouping them into 440 rows you can group 100,000 rows into 440 rows and only have to sort 440 rows.  **The Aggregate Transform is also a asynchronous transform that should be avoided when possible**

Configuring the Sort Transform

These steps will walk you through a basic configuration of the Sort Transform.

Example Overview

Step 1: Configure Flat File Source

Step 2: Add Data Viewer

Step 3: Configure Sort Transfor

Step 4: Add Data Viewer and Destination

Copyright © 2002-2017 Redgate. All Rights Reserved. Privacy Policy. Terms of Use. Report Abuse.