How to simulate Case statement in Azure Data Factory (ADF) compared with SSIS?

,

This post will describe how you use a CASE statement in Azure Data Factory (ADF). If you are coming from SSIS background, you know a piece of SQL statement will do the task. However let's see how do it in SSIS and the very same thing can be achieved in ADF.

Problem statement

For my simple scenario, when the PortfolioTypeCode has a value of either 'Mutual Fund' or 'Pooled Fund', my package should return a 1, otherwise it should return 0.

In SSIS, under data flow you will have an OLEDB source, like shown below in Fig 1.

Fig 1: SSIS OLEDB source

We then open the OLEDB source and then write a T-SQL command, like the one below and we are done:

SELECT Col1,
          CASE WHEN PortfolioCode IN('Mutual fund','Pooled fund')
            THEN 1
            ELSE 0
END  IsFund,
Col2
From Table1
Fig 2: CASE WHEN under SQL command in SSIS

How do you implement this in ADF?

In ADF, to achieve the same goal, you need to use Expressions. ADF has the very same concept of a data flow, like SSIS. In the data flow, after the source dataset is established you can add a 'Derived Column' activity, shown below in Fig 3:

Fig 3: Adding a derived column

Now you can give a the new column a name and then add the expression shown in Fig 4.

Fig 4: Derived column expression

 

As you have seen at the beginning of the article When you write CASE statement in TSQL , your syntax look like below:

CASE WHEN PortfolioCode IN('Mutual fund','Pooled fund')
            THEN 1
         ELSE 0
END  IsFund

 

However, when it comes to ADF expression, then it needs to translate like below:

  case( PortfolioTypeCode=='Mutual Fund',1,
       PortfolioTypeCode=='Pooled Fund',1,0)

 

Let's talk about in details about case expression in ADF, the case expression takes 3 arguments:  case(condition,true_expression,false_expression). From the above code, when  PortfolioTypeCode=='Mutual Fund', if the condition is True, then it will return 1. If this is false, we have added another condition: PortfolioTypeCode=='Pooled Fund'. If this condition is True, then the expression will return 1. If both conditions are false, then the expression will return 0.

We have explained the case expression above and the below Fig 5 shows, how the code look like in ADF.

Fig 5: case expression in ADF

In SSIS whenever you work with data flow and choose OLEDB source, you can write T-SQL code since underneath it's SQL engine, however; for ADF data flow, spark engine is running behind the scene. ADF data flow expression is pretty much powerful, please find more details about the case expression from Microsoft document.

 

 

Rate

4.5 (2)

Share

Share

Rate

4.5 (2)