To update statistics on a whole database, there is the T-SQL command, sp_updatestats. This T-SQL statement works serially on one table at a time, so it is not very fast. Sometimes it would be nice to update statistics on an entire database using not just one, but several threads. The solution is the SSIS package UPDATE_STATS_PARALLEL.dtsx, which creates dynamically and launches another SSIS package. The second package contains as many independent Execute SQL Tasks as needed to update statistics in parallel.
The UPDATE_STATS_PARALLEL.dtsx has several variables that are used as input parameters to control the behaviour of the generated SSIS package. These are:
INSTANCE_NAME: name of the SqlServer Instance
DEST_DB: name of the database to perform update statistics on , and of the dynamically generated SSIS package.
NUM_OF_THREADS: maximum number of threads of the generated dtsx
SOURCE_SELECT: The T-SQL statement that generates the list of tables to update statistics on. This must return two columns: [tname] and [comando]. These are the name of the table and the command to be run on the table, respectively. Here is one example :
Update statistics on tables not empty and with statistics older than 3 days
t.name as [tname]
, ' update statistics ['+t.name+'] ' [comando]
from sys.tables t
left join sys.stats s
on t.object_id = s.object_id
join ( select OBJECT_ID
, sum(rows) rows
group by object_id
having SUM(rows) >0
on t.object_id = pa.object_id
where ( STATS_DATE(t.object_id,stats_id) is null
or datediff(DAY, STATS_DATE(t.object_id,stats_id), GETDATE() ) >=3 )
group by t.name
order by 1
The criteria in the where clause can be modified as needed.
Here is picture of the UPDATE_STATS_PARALLEL.dtsx package
The most important part of UPDATE_STATS_PARALLEL.dtsx is the VB script named "Create DTSX". This script creates an SSIS package named ValueOf(DEST_DB).dtsx. The structure of the code is similar to that shown in this article: Building Packages Programmatically. Another example is this thread: Dynamically create an SSIS Bulk Insert Package.
The code used is shown below.
Public Sub Main()
Dim packageName As String = Dts.Variables("DEST_DB").Value.ToString
app = New Application()
Dts.Events.FireInformation(0, "Update Stats Package", packageName, "", 0, True)
Dim destDbConnectionString As String = Dts.Variables("DEST_CONNECT").Value.ToString
dataTransferPackage = New Package()
dataTransferPackage.LocaleID = 1029
Dim destConn As Microsoft.SqlServer.Dts.Runtime.ConnectionManager = dataTransferPackage.Connections.Add("OLEDB")
destConn.Name = "DestDB"
destConn.ConnectionString = destDbConnectionString
workDestConn = New ServerConnection(Dts.Variables("INSTANCE_NAME").Value.ToString)
workDestConn.DatabaseName = Dts.Variables("DEST_DB").Value.ToString
Dim tablesDS As DataSet = workDestConn.ExecuteWithResults(Dts.Variables("SOURCE_SELECT").Value.ToString)
Dim tablesDT As DataTable = tablesDS.Tables(0)
' For each row in the dataset, add an ExecuteSQLTask that performs update statistics
For Each tablesR As DataRow In tablesDT.Rows
Dim sqlcomandosource As String = tablesR.Item("comando").ToString
'Dts.Events.FireInformation(0, "Creating components for table", "[" + schemaName + "].[" + tableName + "]", "", 0, True)
Dim UpdStatsTask As Executable = dataTransferPackage.Executables.Add("STOCK:SQLTask")
Dim thError As Microsoft.SqlServer.Dts.Runtime.TaskHost = CType(UpdStatsTask, Microsoft.SqlServer.Dts.Runtime.TaskHost)
thError.Name = "Stats " + tablesR.Item("tname").ToString
' nnn thError.Properties("TimeOut").SetValue(thError, CUInt(50))
'Dts.Events.FireInformation(0, "Error task created for table", "[" + schemaName + "].[" + tableName + "]", "", 0, True)
dataTransferPackage.MaxConcurrentExecutables = Dts.Variables("NUM_OF_THREADS").Value.ToString
dataTransferPackage.MaximumErrorCount = 100
dataTransferPackage.DelayValidation = True
app.SaveToXml(packageName + ".dtsx", dataTransferPackage, Nothing)
Dts.TaskResult = ScriptResults.Success
Here is a picture of how the dynamically generated SSIS package might look. The package as many independent Execute SQL Tasks as the rows in the result set from the variable SOURCE_SELECT.
If this generated package runs, it will execute simultaneously up to NUM_OF_THREADS of these tasks. It will be much faster, and also more resource intensive, than the built-in T-SQL command, sp_updatestats.
After creating this package, the master package, UPDATE_STATS_PARALLEL.dtsx, executes the package using an SSIS ExecutePackageTask. The master package then deletes the child package using an SSIS FileSystemTask.
In the attachment there is also a T-SQL stored procedure that can be used to launch UPDATE_STATS_PARALLEL.dtsx using a T-SQL command. This is an optional part of the proposed solution .