Click here to monitor SSC
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Update statistics in parallel

By Federico Iori,

To update statistics on a whole database, there is the T-SQL command, sp_updatestats. This T-SQL statement works serially on one table at a time, so it is not very fast. Sometimes it would be nice to update statistics on an entire database using not just one, but several threads. The solution is the SSIS package  UPDATE_STATS_PARALLEL.dtsx, which creates dynamically and launches another SSIS package. The second package contains as many independent Execute SQL Tasks as needed to update statistics in parallel. 

The UPDATE_STATS_PARALLEL.dtsx has several variables that are used as input parameters to control the behaviour of the generated SSIS package. These are:

INSTANCE_NAME: name of the SqlServer Instance

DEST_DB: name of the database to perform update statistics on , and of the dynamically generated SSIS package.

NUM_OF_THREADS: maximum number of threads of the generated dtsx

SOURCE_SELECT: The T-SQL statement that generates the list of tables to update statistics on. This must return two columns:  [tname] and [comando]. These are the name of the table and the command to be run on the table, respectively.  Here is one example :

 Update statistics on tables not empty  and with statistics older than 3 days
select as [tname]
, ' update statistics ['']  ' [comando]
 from sys.tables t
  left join sys.stats s 
   on t.object_id = s.object_id  
  join ( select OBJECT_ID 
              , sum(rows) rows 
          from  sys.partitions 
          group by object_id 
          having SUM(rows) >0
       )  pa
    on t.object_id = pa.object_id 
 where (  STATS_DATE(t.object_id,stats_id) is null 
       or datediff(DAY, STATS_DATE(t.object_id,stats_id), GETDATE() ) >=3        )
 group by
 order by 1

The criteria in the where clause can be modified as needed.

Here is picture of the UPDATE_STATS_PARALLEL.dtsx package

The most important part of UPDATE_STATS_PARALLEL.dtsx is the VB script named "Create DTSX". This script creates an SSIS package named  ValueOf(DEST_DB).dtsx. The structure of the code is similar to that shown in this article: Building Packages Programmatically. Another example is this thread: Dynamically create an SSIS Bulk Insert Package

The code used is shown below.

 Public Sub Main()

        Dim packageName As String = Dts.Variables("DEST_DB").Value.ToString

        app = New Application()

        Dts.Events.FireInformation(0, "Update Stats Package", packageName, "", 0, True)

        Dim destDbConnectionString As String = Dts.Variables("DEST_CONNECT").Value.ToString
      'Create package
        dataTransferPackage = New Package()
        dataTransferPackage.LocaleID = 1029

      'Create DestConection
        Dim destConn As Microsoft.SqlServer.Dts.Runtime.ConnectionManager = dataTransferPackage.Connections.Add("OLEDB")
        destConn.Name = "DestDB"
        destConn.ConnectionString = destDbConnectionString

        workDestConn = New ServerConnection(Dts.Variables("INSTANCE_NAME").Value.ToString)
        workDestConn.DatabaseName = Dts.Variables("DEST_DB").Value.ToString

        Dim tablesDS As DataSet = workDestConn.ExecuteWithResults(Dts.Variables("SOURCE_SELECT").Value.ToString)
        Dim tablesDT As DataTable = tablesDS.Tables(0)

     ' For each row in the dataset, add an ExecuteSQLTask that performs update statistics
        For Each tablesR As DataRow In tablesDT.Rows
            Dim sqlcomandosource As String = tablesR.Item("comando").ToString
            'Dts.Events.FireInformation(0, "Creating components for table", "[" + schemaName + "].[" + tableName + "]", "", 0, True)
            Dim UpdStatsTask As Executable = dataTransferPackage.Executables.Add("STOCK:SQLTask")
            Dim thError As Microsoft.SqlServer.Dts.Runtime.TaskHost = CType(UpdStatsTask, Microsoft.SqlServer.Dts.Runtime.TaskHost)
            thError.Name = "Stats " + tablesR.Item("tname").ToString
            thError.Properties("Connection").SetValue(thError, dataTransferPackage.Connections("DestDB").Name)
            thError.Properties("SqlStatementSourceType").SetValue(thError, 1)
            thError.Properties("SqlStatementSource").SetValue(thError, sqlcomandosource)
            thError.Properties("DelayValidation").SetValue(thError, True)
            ' nnn thError.Properties("TimeOut").SetValue(thError, CUInt(50))
            'Dts.Events.FireInformation(0, "Error task created for table", "[" + schemaName + "].[" + tableName + "]", "", 0, True)


        dataTransferPackage.MaxConcurrentExecutables = Dts.Variables("NUM_OF_THREADS").Value.ToString

        dataTransferPackage.MaximumErrorCount = 100

        dataTransferPackage.DelayValidation = True

        app.SaveToXml(packageName + ".dtsx", dataTransferPackage, Nothing)

        Dts.TaskResult = ScriptResults.Success
    End Sub

Here is a picture of how the dynamically generated SSIS package might look. The package as many independent Execute SQL Tasks as the rows in the result set from the variable SOURCE_SELECT.

If this generated package runs, it will execute simultaneously up to NUM_OF_THREADS of these tasks. It will be much faster, and also more resource intensive, than the built-in T-SQL command, sp_updatestats.

After creating this package, the master package, UPDATE_STATS_PARALLEL.dtsx, executes the package using an SSIS ExecutePackageTask. The master package then deletes the child package using an SSIS FileSystemTask.

In the attachment there is also a T-SQL stored procedure that can be used to launch UPDATE_STATS_PARALLEL.dtsx using a T-SQL command. This is an optional part of the proposed solution .



sp_Update_Statistics_Parallel_byDTSX.sql | UPDATE_STATS_PARALLEL.dtsx
Total article views: 5105 | Views in the last 30 days: 2
Related Articles

creating index and updating statistic

is it necessary to update statistic after creating an index ?


Update statistics

Update statistics older 1 day


Statistics update

Statistics update


Update Statistics Maint.plan Job

Update Statistics Maint.plan Job


Statistics Update Clarification

By default statistics are created automatically within SQL Server. And, by default, these stats are ...