Loading partitions incrementally using SSIS

Shubs, 2019-06-28 (first published: 2017-10-13)

In the article Keeping Fact Tables Online while Loading via Partition Switching¹, we noted that for very large fact tables in which each partition is itself very large, it is not practical to reload partitions completely. For such tables, you would switch only the partitions that are expected to receive new data to a secondary table, load the new data into those partitions in the secondary table, and then switch the partitions back to the main table (fig 1). In this article we discuss how this can be implemented in SSIS.

I have created two target tables, dbo.TGT_tablename_tbl and shadow.TGT_tablename_tbl, in the dbo and shadow schemas respectively (query attached). The table in the dbo schema is the primary fact table used for reporting. The table in the shadow schema is empty at the start of the process and has the same structure as the primary table in the dbo schema.

The process in fig 1 was implemented in SSIS as follows:

Task 1: Here we get the latest insert date from the target table (fig 3). In this example we assume that source data is never updated and that only new data is added. We therefore use the maximum date from the target table to set the variable v_max_date (fig 4), which is then used to build a query that extracts data after this date.
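A minimal sketch of the lookup in Task 1. Python's built-in sqlite3 module stands in for SQL Server here so that the logic can be run anywhere. The table name tgt, the column name insert_date, and the '1900-01-01' sentinel returned for an empty table are assumptions for illustration and are not taken from the attached query.

```python
import sqlite3

EMPTY_SENTINEL = "1900-01-01"  # assumed marker meaning "target table is empty"

def get_max_insert_date(conn):
    """Return the latest insert date in the target table, or the sentinel if it is empty."""
    row = conn.execute(
        "SELECT COALESCE(MAX(insert_date), ?) FROM tgt", (EMPTY_SENTINEL,)
    ).fetchone()
    return row[0]

# In-memory stand-in for the target fact table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tgt (id INTEGER, insert_date TEXT)")
empty_result = get_max_insert_date(conn)  # table has no rows yet

conn.executemany(
    "INSERT INTO tgt VALUES (?, ?)",
    [(1, "2017-09-30"), (2, "2017-10-12"), (3, "2017-10-01")],
)
v_max_date = get_max_insert_date(conn)  # latest date among the loaded rows
```

In an Execute SQL Task the same idea would likely be a single-row result set mapped to v_max_date, with ISNULL or COALESCE wrapping MAX so that an empty table yields a known value instead of NULL.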

Task 2: This Execute SQL Task is executed when the target table is not empty. The precedence constraint set in Flow 1 evaluates the v_max_date variable (fig 5). In Task 2, we ensure that the table in the shadow schema is empty by truncating it (fig 6).
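The branching performed by the precedence constraints can be sketched as follows. This assumes that an empty target table is signalled by a sentinel value in v_max_date; the article does not state the exact expression used, so the sentinel and the task labels below are illustrative only.

```python
EMPTY_SENTINEL = "1900-01-01"  # assumed value of v_max_date when the target is empty

def tasks_to_run(v_max_date):
    """Mimic the precedence constraints: pick the branch based on v_max_date."""
    if v_max_date == EMPTY_SENTINEL:
        # Target is empty: go straight to the initial full load (Task 6).
        return ["Task 6: initial load"]
    # Target has data: truncate shadow, switch out, load, switch back.
    return [
        "Task 2: truncate shadow table",
        "Task 3: switch partition out",
        "Task 4: load incremental data",
        "Task 5: switch partition in",
    ]

# Statement Task 2 would issue against the shadow table named in this article.
TRUNCATE_SHADOW_SQL = "TRUNCATE TABLE shadow.TGT_tablename_tbl;"
```

Truncating first matters because a partition can only be switched into a target partition that is empty.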

Task 3: In this Execute SQL Task, we switch the partition out from the dbo schema to the shadow schema (step 2). In step 1, we get the partition number based on the variable v_max_date. By pulling only the required partitions into the shadow schema, we bring in only the data for the partition that will receive new rows and keep the historical data intact in the primary table.
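The two steps of Task 3 can be sketched as below. The first function mimics how a partition function maps a value to a partition number (in SQL Server this lookup is done with $PARTITION); the second builds the switch statement. The monthly boundary values are hypothetical, and whether the real partition function is RANGE RIGHT or RANGE LEFT is an assumption, so both are supported.

```python
from bisect import bisect_left, bisect_right
from datetime import date

def partition_number(boundaries, value, range_right=True):
    """Return the 1-based partition number that `value` falls into.

    `boundaries` must be sorted ascending. With RANGE RIGHT a boundary value
    belongs to the partition on its right; with RANGE LEFT, to the one on its left.
    """
    if range_right:
        return bisect_right(boundaries, value) + 1
    return bisect_left(boundaries, value) + 1

def switch_out_sql(partition_no):
    """Build the statement that moves one partition from dbo to shadow."""
    return (
        "ALTER TABLE dbo.TGT_tablename_tbl "
        f"SWITCH PARTITION {partition_no} "
        f"TO shadow.TGT_tablename_tbl PARTITION {partition_no};"
    )

# Hypothetical monthly boundaries.
boundaries = [date(2017, 9, 1), date(2017, 10, 1), date(2017, 11, 1)]
p = partition_number(boundaries, date(2017, 10, 12))
stmt = switch_out_sql(p)
```

Switching by partition number keeps the operation scoped to a single partition, so every other partition of the primary table stays in place and queryable.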

Task 4: In this Data Flow Task, we pull the incremental data based on v_max_date and load it into the table in the shadow schema. The query is constructed dynamically through the v_sql_qry variable, based on the value set for the v_max_date variable.
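A rough sketch of how the text held in v_sql_qry could be assembled from v_max_date. The source table name dbo.SRC_tablename_tbl and the column name insert_date are hypothetical; in SSIS the same concatenation would normally live in an expression on the variable.

```python
from datetime import date

def build_extract_query(v_max_date):
    """Build the incremental extract query, similar in spirit to v_sql_qry."""
    # YYYYMMDD literals are read the same way regardless of the session's
    # language or date format settings. The value comes from a typed date,
    # not free text, so plain string formatting is acceptable here.
    literal = v_max_date.strftime("%Y%m%d")
    return (
        "SELECT * FROM dbo.SRC_tablename_tbl "
        f"WHERE insert_date > '{literal}'"
    )

v_sql_qry = build_extract_query(date(2017, 10, 12))
```

The strict greater-than comparison matches the assumption that rows up to and including v_max_date are already loaded; if rows can arrive late for a date that has already been loaded, the filter would need a different watermark.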

Task 5: Here we switch the partitions that have been loaded from the table in the shadow schema back to the dbo schema.
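The round trip of Tasks 3 to 5 can be illustrated with a small in-memory model, where each table is a dictionary from partition number to rows. This is only a toy model of the behaviour, with made-up row values; it is not how SQL Server implements a switch.

```python
def switch_partition(source, target, partition_no):
    """Move one partition's rows from source to target in a single step."""
    if target.get(partition_no):
        # A switch requires the receiving partition to be empty.
        raise ValueError(f"target partition {partition_no} is not empty")
    target[partition_no] = source.get(partition_no, [])
    source[partition_no] = []

# Partition number -> rows; hypothetical data.
dbo = {1: ["r1", "r2"], 2: ["r3"], 3: ["r4"]}
shadow = {1: [], 2: [], 3: []}

switch_partition(dbo, shadow, 3)   # Task 3: switch out the latest partition
shadow[3].extend(["r5", "r6"])     # Task 4: load new rows into shadow
switch_partition(shadow, dbo, 3)   # Task 5: switch the loaded partition back
```

Partitions 1 and 2 never leave the primary table, which is why historical data stays available for the whole duration of the load.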

Task 6: This Data Flow Task is executed when the target table in the dbo schema is empty. It is used specifically for the initial load.

Conclusion:

With this approach, only the latest partitions are unavailable during the load, so the primary table remains available for querying historical data. Once the data has been loaded into existing or new partitions, the switch back is initiated; it is very fast and restores the primary table to its original state.

References

1. http://www.sqlservercentral.com/articles/SQL+Server/149123/
