Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Update statistics in parallel

By Federico Iori,

To update statistics on a whole database, there is the T-SQL command, sp_updatestats. This T-SQL statement works serially on one table at a time, so it is not very fast. Sometimes it would be nice to update statistics on an entire database using not just one, but several threads. The solution is the SSIS package  UPDATE_STATS_PARALLEL.dtsx, which creates dynamically and launches another SSIS package. The second package contains as many independent Execute SQL Tasks as needed to update statistics in parallel. 

The UPDATE_STATS_PARALLEL.dtsx has several variables that are used as input parameters to control the behaviour of the generated SSIS package. These are:

INSTANCE_NAME: name of the SqlServer Instance

DEST_DB: name of the database to perform update statistics on , and of the dynamically generated SSIS package.

NUM_OF_THREADS: maximum number of threads of the generated dtsx

SOURCE_SELECT: The T-SQL statement that generates the list of tables to update statistics on. This must return two columns:  [tname] and [comando]. These are the name of the table and the command to be run on the table, respectively.  Here is one example :

/*
 Update statistics on tables not empty  and with statistics older than 3 days
*/
select
  t.name as [tname]
, ' update statistics ['+t.name+']  ' [comando]
 from sys.tables t
  left join sys.stats s 
   on t.object_id = s.object_id  
  join ( select OBJECT_ID 
              , sum(rows) rows 
          from  sys.partitions 
          group by object_id 
          having SUM(rows) >0
       )  pa
    on t.object_id = pa.object_id 
 where (  STATS_DATE(t.object_id,stats_id) is null 
       or datediff(DAY, STATS_DATE(t.object_id,stats_id), GETDATE() ) >=3        )
 group by t.name
 order by 1

The criteria in the where clause can be modified as needed.

Here is picture of the UPDATE_STATS_PARALLEL.dtsx package

The most important part of UPDATE_STATS_PARALLEL.dtsx is the VB script named "Create DTSX". This script creates an SSIS package named  ValueOf(DEST_DB).dtsx. The structure of the code is similar to that shown in this article: Building Packages Programmatically. Another example is this thread: Dynamically create an SSIS Bulk Insert Package

The code used is shown below.

 Public Sub Main()

        Dim packageName As String = Dts.Variables("DEST_DB").Value.ToString

        app = New Application()

        Dts.Events.FireInformation(0, "Update Stats Package", packageName, "", 0, True)

        Dim destDbConnectionString As String = Dts.Variables("DEST_CONNECT").Value.ToString
  
      'Create package
        dataTransferPackage = New Package()
        dataTransferPackage.LocaleID = 1029


      'Create DestConection
        Dim destConn As Microsoft.SqlServer.Dts.Runtime.ConnectionManager = dataTransferPackage.Connections.Add("OLEDB")
        destConn.Name = "DestDB"
        destConn.ConnectionString = destDbConnectionString

        workDestConn = New ServerConnection(Dts.Variables("INSTANCE_NAME").Value.ToString)
        workDestConn.DatabaseName = Dts.Variables("DEST_DB").Value.ToString

        
        Dim tablesDS As DataSet = workDestConn.ExecuteWithResults(Dts.Variables("SOURCE_SELECT").Value.ToString)
        Dim tablesDT As DataTable = tablesDS.Tables(0)

     ' For each row in the dataset, add an ExecuteSQLTask that performs update statistics
        For Each tablesR As DataRow In tablesDT.Rows
            Dim sqlcomandosource As String = tablesR.Item("comando").ToString
            'Dts.Events.FireInformation(0, "Creating components for table", "[" + schemaName + "].[" + tableName + "]", "", 0, True)
            Dim UpdStatsTask As Executable = dataTransferPackage.Executables.Add("STOCK:SQLTask")
            Dim thError As Microsoft.SqlServer.Dts.Runtime.TaskHost = CType(UpdStatsTask, Microsoft.SqlServer.Dts.Runtime.TaskHost)
            thError.Name = "Stats " + tablesR.Item("tname").ToString
            thError.Properties("Connection").SetValue(thError, dataTransferPackage.Connections("DestDB").Name)
            thError.Properties("SqlStatementSourceType").SetValue(thError, 1)
            thError.Properties("SqlStatementSource").SetValue(thError, sqlcomandosource)
            thError.Properties("DelayValidation").SetValue(thError, True)
            ' nnn thError.Properties("TimeOut").SetValue(thError, CUInt(50))
            'Dts.Events.FireInformation(0, "Error task created for table", "[" + schemaName + "].[" + tableName + "]", "", 0, True)

        Next


        dataTransferPackage.MaxConcurrentExecutables = Dts.Variables("NUM_OF_THREADS").Value.ToString

        dataTransferPackage.MaximumErrorCount = 100

        dataTransferPackage.DelayValidation = True


        app.SaveToXml(packageName + ".dtsx", dataTransferPackage, Nothing)

        Dts.TaskResult = ScriptResults.Success
    End Sub


Here is a picture of how the dynamically generated SSIS package might look. The package as many independent Execute SQL Tasks as the rows in the result set from the variable SOURCE_SELECT.

If this generated package runs, it will execute simultaneously up to NUM_OF_THREADS of these tasks. It will be much faster, and also more resource intensive, than the built-in T-SQL command, sp_updatestats.

After creating this package, the master package, UPDATE_STATS_PARALLEL.dtsx, executes the package using an SSIS ExecutePackageTask. The master package then deletes the child package using an SSIS FileSystemTask.

In the attachment there is also a T-SQL stored procedure that can be used to launch UPDATE_STATS_PARALLEL.dtsx using a T-SQL command. This is an optional part of the proposed solution .

Resources:

sp_Update_Statistics_Parallel_byDTSX.sql | UPDATE_STATS_PARALLEL.dtsx
Total article views: 4821 | Views in the last 30 days: 4
 
Related Articles
FORUM

creating index and updating statistic

is it necessary to update statistic after creating an index ?

FORUM

Statistics update

Statistics update

FORUM

Update Statistics Maint.plan Job

Update Statistics Maint.plan Job

BLOG

Statistics Update Clarification

By default statistics are created automatically within SQL Server. And, by default, these stats are ...

FORUM

Reindex and Update Statistics History

Reindex and Update Statistics History

 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones