Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Update statistics in parallel


Update statistics in parallel

Author
Message
Federico Iori
Federico Iori
Valued Member
Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)

Group: General Forum Members
Points: 51 Visits: 300
Comments posted to this topic are about the item Update statistics in parallel
Pedro Figueiredo-203835
Pedro Figueiredo-203835
Forum Newbie
Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)

Group: General Forum Members
Points: 1 Visits: 147
I've corrected the SQL to include the schema name:


SELECT t.name AS [tname] ,
' update statistics ' + sch.name + '.[' + t.name + '] ' [comando]
FROM sys.tables t
INNER JOIN sys.schemas sch ON sch.schema_id = t.schema_id
LEFT JOIN sys.stats s ON t.object_id = s.object_id
JOIN ( SELECT object_id ,
SUM(rows) rows
FROM sys.partitions
GROUP BY object_id
HAVING SUM(rows) > 0
) pa ON t.object_id = pa.object_id
WHERE ( STATS_DATE(t.object_id, stats_id) IS NULL
OR DATEDIFF(DAY, STATS_DATE(t.object_id, stats_id), GETDATE()) >= 3
)
GROUP BY sch.name ,
t.name
ORDER BY 1


Federico Iori
Federico Iori
Valued Member
Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)

Group: General Forum Members
Points: 51 Visits: 300
Good correction !
Alexander Suprun
Alexander Suprun
Old Hand
Old Hand (341 reputation)Old Hand (341 reputation)Old Hand (341 reputation)Old Hand (341 reputation)Old Hand (341 reputation)Old Hand (341 reputation)Old Hand (341 reputation)Old Hand (341 reputation)

Group: General Forum Members
Points: 341 Visits: 1516
Great job!

One more fix:
SELECT  REPLACE(REPLACE(sch.name + '_' + t.name, '[', '_'), ']', '_') AS [tname] ,
' update statistics ' + QUOTENAME(sch.name) + '.' + QUOTENAME(t.name) [comando]
FROM sys.tables t
INNER JOIN sys.schemas sch ON sch.schema_id = t.schema_id
LEFT JOIN sys.stats s ON t.object_id = s.object_id
JOIN ( SELECT object_id ,
SUM(rows) rows
FROM sys.partitions
GROUP BY object_id
HAVING SUM(rows) > 0
) pa ON t.object_id = pa.object_id
WHERE ( STATS_DATE(t.object_id, stats_id) IS NULL
OR DATEDIFF(DAY, STATS_DATE(t.object_id, stats_id), GETDATE()) >= 3
)
GROUP BY sch.name ,
t.name
ORDER BY 1




Alex Suprun
davoscollective
davoscollective
SSChasing Mays
SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)

Group: General Forum Members
Points: 601 Visits: 1000
Wow, spinning up SSIS tasks on the fly to take advantage of multithreading. This is exploding my mind at the possibilities.

Thanks for this article, it's a great example to explain this technique. I like that your code considers maximum threads and error handling, I can see the biggest danger might be creating too many packages or falling into some loop of continuous package creation.

Have you recorded some performance metrics of this technique versus using the stored proc to update stats one at a time?
Federico Iori
Federico Iori
Valued Member
Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)

Group: General Forum Members
Points: 51 Visits: 300
Thanks to Alexander for the fix.
About performance metrics :
in general, my update statistics runs N times faster than built-in sp_updatestats , where N is number of threads .
I think that N= 5/10 is good for most purposes ... anyway consider that updating statistics in this way is a task that makes heavy use of db resources, to perform his task faster than the dedicated built-in does
cmjj
cmjj
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 279
Part of our current daily db maintenance is a sql script that reads through all indexes in a db and determines if they need a reorg or rebuild, or no maintenance required. Problem there is the script runs serially. We have time in our schedule to do this, but it's always great to get things running quicker.

Can anyone think of a reason you couldn't use this same technique to create a list of reorg or rebuild commands, create a sql task for each, and therefore take advantage of the parallelism demonstrated here?
Federico Iori
Federico Iori
Valued Member
Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)

Group: General Forum Members
Points: 51 Visits: 300
chadmjordan (2/2/2013)
Part of our current daily db maintenance is a sql script that reads through all indexes in a db and determines if they need a reorg or rebuild, or no maintenance required. Problem there is the script runs serially. We have time in our schedule to do this, but it's always great to get things running quicker.

Can anyone think of a reason you couldn't use this same technique to create a list of reorg or rebuild commands, create a sql task for each, and therefore take advantage of the parallelism demonstrated here?


There should be no problem, but consider that rebuilding an index is , in general, much heavier task than update statistics, so I thin you will not be able to use a large number of threads, but you could try with 4/5 , it would speed up your script 4/5 times ...
Arjun Sivadasan
Arjun Sivadasan
SSC-Addicted
SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)

Group: General Forum Members
Points: 448 Visits: 976
Hi Federico, I gave this a try.

We have daily tasks running in our system. Every day at a specific time, we truncate certain tables and repopulate them. We have to update statistics for these tables.

I've written a script which queries sysindexes and finds out the tables which were modified. I integrated that with the script in your article to return tname and comando. This should work very well. However, I get a runtime error. I'm not very good at SSIS. Can you please help me out?

I have attached screenshots of the error. The error says "the source code cannot be displayed". I think the parent package is not able to reference the child package. If I add the child package to the solution and try to execute the child package alone, it works alright.

And when we say
app.SaveToXml(packageName + ".dtsx", dataTransferPackage, Nothing)



does the package always get saved within the project directory? How do I deploy this to server? I want to schedule this as a job in the server.
Attachments
error1.JPG (3 views, 96.00 KB)
error2.JPG (3 views, 132.00 KB)
Federico Iori
Federico Iori
Valued Member
Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)Valued Member (51 reputation)

Group: General Forum Members
Points: 51 Visits: 300
Arjun Sivadasan (3/27/2013)
Hi Federico, I gave this a try.

We have daily tasks running in our system. Every day at a specific time, we truncate certain tables and repopulate them. We have to update statistics for these tables.

I've written a script which queries sysindexes and finds out the tables which were modified. I integrated that with the script in your article to return tname and comando. This should work very well. However, I get a runtime error. I'm not very good at SSIS. Can you please help me out?

I have attached screenshots of the error. The error says "the source code cannot be displayed". I think the parent package is not able to reference the child package. If I add the child package to the solution and try to execute the child package alone, it works alright.

And when we say
app.SaveToXml(packageName + ".dtsx", dataTransferPackage, Nothing)



does the package always get saved within the project directory? How do I deploy this to server? I want to schedule this as a job in the server.

Arjun ,
1) please send me the TSQL code that returns tname and comando , I think the problem is there
2) app.SaveToXml saves to the same directory where is the master DTSX
3) If you want this all to run on the server, you should preferably, as I have done, build a bat file that runs the master DTSX , saved in a dedicated directory on the database server
This will not work if the master DTSX is saved into the SQlSever database , the DTSX must be saved in the filesystem
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search