SQLServerCentral Article

Using a TSQL semaphore in SSIS

,

In computer science, a semaphore is a variable or abstract data type that provides a simple but useful abstraction for controlling access by multiple processes to a common resource in a parallel programming environment. As long as we are working with a RDBMS ( Relational DataBase Management System), the simplest way to implement a semaphore is to create a specific table and to use the lock mechanism, provided by the RDBMS, on the whole semaphore table or on a row of the semaphore table.

This article, Implementing a T-SQL semaphore,  shows how a semaphore is raised inside a T-SQL script. Now we can show how to use this principle in SSIS. We are going to show how to acquire a semaphore before running a Data Flow task and how to release it after the Data Flow is completed .

First, let us create the needed tables :

-- Create semaphore table drop_fi_semaforo 
use dwhsk_stage_prod
go
drop table drop_fi_semaforo
create table drop_fi_semaforo ( uno char(1) ,
constraint pk_drop_fi_semaforo primary key clustered ( uno) on dwhsk_stage_data
) on dwhsk_stage_data
/* The table will contain 3 semaphores, 
but for our example one will be enough */insert into drop_fi_semaforo values ( '1')
insert into drop_fi_semaforo values ( '2')
insert into drop_fi_semaforo values ( '3')
-- Create table to perform a long running bulk insert on 
USE [DWHSK_TARGET_PROD]
GO
-- drop table [drop_fi_POP_PROVA]
CREATE TABLE [dbo].[drop_fi_POP_PROVA](
    [Event_ID] [bigint] NOT NULL,
    [Create_Time] [datetime] NOT NULL,
    [Current_Flag] [bit] NULL,
    [File_ID] [int] NULL,
    [Network_Start_Time] [datetime] NOT NULL,
    [Start_Time] [datetime] NOT NULL,
    [Contract_ID] [int] NOT NULL,
    [Customer_ID] [int] NOT NULL,
    [Target_Customer_ID] [int] NOT NULL,
    [Network_ID] [smallint] NOT NULL)
 ON [PS_PORAEV]([Network_Start_Time])
WITH
(
DATA_COMPRESSION = PAGE
)

Next we create a new SSIS package. The SSIS package and its components must have the property "Isolation Level" set to "Serializable".

We  add to the package a Sequence Container, named for example SC,  with the TransactionOption property set to "Required".  We then add into the Sequence Container a new ExecuteSQLTask named GetSemaphore, with TransactionOption=Required, that tries to acquire the semaphore using this TSQL statement:

/* Acquiring the semaphore means simply putting a lock 
on a specific row of table drop_fi_semaforo : no data is modified.
*/begin tran
 select *
  from drop_fi_semaforo with ( rowlock, UPDLOCK, holdlock)   
  where Uno = '1'

We then add a DataFlow Task that must have the property  TransactionOption set to "NotSupported". This task  performs a long running bulk insert, into table drop_fi_POP_PROVA, in a separate session and a separate transaction from the session that holds the lock.

When the DataFlow completes, an ExecuteSQLTask named Release Semaphore, with property TransactionOption set to Required, resumes the transaction opened by GetSemaphore and releases the semaphore simply issuing a commit.

The SSIS package will look like this way :

To display the locks during the running of the package, we can use the following T-SQL statement:

/*
The semaphore table is named drop_fi_semaforo. 
The table, drop_fi_POP_PROVA, 
is the table on which runs the long running bulk insert 
performed in DataFlow
*/select o.name table_name , l.request_mode lock_type , l.request_type
, l.request_session_id
, coalesce(tt.transaction_id , tt2.transaction_id ) transaction_id ,
coalesce(tt.database_transaction_log_bytes_reserved , tt2.database_transaction_log_bytes_reserved ) transaction_log_use_bytes
 from sys.dm_tran_locks l
join ( select * from ( select * from dwhsk_stage_prod.sys.objects
union ALL
select * from dwhsk_target_prod.sys.objects ) oo
where name in ( 'drop_fi_semaforo' , 'drop_fi_POP_PROVA'  )
) o
on l.resource_associated_entity_id = o.object_id
left join sys.dm_tran_session_transactions t on l.request_session_id = t.session_id
and l.request_session_id > 0
left join sys.dm_tran_database_transactions tt
on t.transaction_id = tt.transaction_id
and tt.database_id = l.resource_database_id and l.request_session_id > 0
left join sys.dm_tran_database_transactions tt2
on tt2.transaction_id = request_owner_id

The locks when Get Semaphore has completed and Data Flow has not begun will be :

table_name lock_type request_type request_session_id transaction_id transaction_log_use_bytes
drop_fi_semaforo IX LOCK 80 202 0

The locks when Get Semaphore has completed and Data Flow is running will be :

table_name lock_type request_type request_session_id transaction_id transaction_log_use_bytes
drop_fi_POP_PROVA IX LOCK 83 201 643050
drop_fi_semaforo IX LOCK 80 202 0

We see that the session of the DataFlow runs in a different transaction from the transaction that is holding the semaphore, and that holding the semaphore does not use any log space.

When the package finishes to run, all locks will be released.

In conclusion, we have showed how to implement a semaphore that can be acquired by a generic DataFlow of a SSIS package, using a SequenceContainer appropriately designed.

Resources

Rate

2.94 (18)

You rated this post out of 5. Change rating

Share

Share

Rate

2.94 (18)

You rated this post out of 5. Change rating