SSIS Query Timeout after hours of 100% CPU usage

  • I have a SQL Server 2005 SSIS package set up on a local quad-core 2.4 GHz, 4 GB RAM machine,

    and live on a remote Pentium 4 2.8 GHz, 3.8 GB RAM machine.

    The query inside the SSIS package is SQL Server intensive, and it can take up to 3 hours (just for the SQL stored procedure part) to finish when processing a large amount of data.

    The package is scheduled to run every 4 hours.

    On our local setup, the SQL database slows down for other applications while the package is running, but we can still log in to the machine and see around 30% CPU usage.

    On our remote setup, however, CPU usage goes up to 100% while the package runs, which makes any other operation on the server impossible: logging in takes forever, and I can't open Management Studio to cancel the scheduled SSIS job. The job fails with a query timeout after 4 hours, even when processing only 1 hour's worth of data. This used to work fine with the test feed and a smaller live feed.

    Please note that there are other scheduled jobs that access the SQL Server to extract data, scheduled to run every 3 hours. I suspect that when those scripts run concurrently with my SSIS package they cause lock contention, which then makes my package fail (a query for spotting blocking is sketched at the end of this post).

    Sometimes I do get the

    "Transaction (Process ID 51) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction."

    error, which would point to a locking issue,

    but I get the "Query timeout expired" message far more often.

    Any thoughts?
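
    As a starting point for confirming whether blocking is really the problem, something along these lines can be run from a separate connection while the package is executing. This is only a rough sketch using the SQL Server 2005 dynamic management views; session and object names will differ in your environment.

        -- List requests that are currently blocked, together with the session blocking them.
        SELECT
            r.session_id,
            r.blocking_session_id,      -- 0 would mean the request is not blocked
            r.wait_type,
            r.wait_time,                -- milliseconds spent waiting so far
            r.status,
            t.text AS running_sql
        FROM sys.dm_exec_requests AS r
        CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
        WHERE r.blocking_session_id <> 0;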

  • When you run the package locally, are you accessing a different database than when you try to execute it in your remote environment?

  • Yes, each setup accesses its own local SQL database.

    I backed up the database from the remote server and restored it on our local setup.

  • If your remote environment is the live environment, there may be an issue of resource contention. If you have a lot of users accessing the system while your SSIS package runs, you could end up in a situation where excessive blocking or even deadlocking will impact your package. Assuming the local environment is your dev or testing environment, you may not see the same issues because fewer people are using the system at any given time.

    If you can provide details about what this package is doing, I'll be glad to offer some input on how to overcome this challenge.

    hth,

    Tim

  • Well, basically my package imports an XML file into a staging table in the database using SQLXMLBulkLoad. The next step transforms the data in the staging table into the final table in the same database, and that transformation involves a lot of joins, inserts, updates, and indexing.

    On the live remote server there are other scheduled processes written by other programmers, and I'm not sure exactly what they do. I've gone through as many of their SQL scripts as I could to add NOLOCK to their SELECT statements (an example of the hint is sketched at the end of this post).

    I've also split my SQL stored procedure into smaller chunks to make sure that plan compilation isn't the issue.
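
    For reference, here is roughly what that hint looks like when added to a SELECT statement; the table and column names below are made up for illustration, not the actual schema.

        -- Read without taking shared locks (dirty reads are possible,
        -- so this is only appropriate where slightly stale data is acceptable).
        SELECT s.SaleID, s.Amount
        FROM dbo.Sales AS s WITH (NOLOCK)
        WHERE s.SaleDate >= '20090101';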

  • If your package runs for several hours even in your local development environment, I assume that there's a lot of transformation or a huge number of records being affected. There are a couple of things you might try.

    First of all, if you have several different source/transform/destinations, consider further breaking those apart into sequential data flow tasks. This could cause your package to actually take longer to run, but you also reduce the risk of excessive blocking or deadlocking (and possible failure) of your package.

    Also, if you're not already doing it, consider adding indexes to your staging tables. I see a lot of use of staging tables where no indexes are created, and in some cases this is fine, but if you have a large number of records, it can be worth the processing time to create one or more indexes on the staging table.
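
    As a rough illustration of that suggestion, an index on the staging table's join columns can be created after the bulk load and before the transformation step. The table and column names here are hypothetical.

        -- Index the column the staging-to-final joins and updates filter on,
        -- created after the load so the bulk insert itself is not slowed down.
        CREATE NONCLUSTERED INDEX IX_StagingSales_SaleID
            ON dbo.StagingSales (SaleID)
            INCLUDE (SaleDate, Amount);   -- covering columns used by the transform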

    Further, if you're using SQL Server 2008 on that machine, consider using the Cache Transformation and Cache Data Source. If you reuse a particular subset of data repeatedly, these objects allow you to store the information in cache rather than making multiple trips to the database for the same information.

    hth,

    Tim

  • Thanks for the reply, Tim.

    I'm using SQL Server 2005. I'm already breaking the main transformation task down into several subtasks; I could go further, but I don't think it will help much. I've seen it hang every time on either the update or the insert into the Sales table (our largest table). It usually deals with about 40k new rows daily, but I've now split it up so that each run only processes around 6,000 rows.

    I'll try adding an index to the staging table as you suggest.
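
    One way to keep locks short during that Sales insert step is to process the staging rows in batches and commit after each batch. The sketch below assumes a StagingSales table with a Processed flag; all object names are illustrative rather than the actual schema.

        DECLARE @BatchSize int;
        SET @BatchSize = 6000;

        -- Processed: 0 = not started, 1 = in the current batch, 2 = done
        WHILE EXISTS (SELECT 1 FROM dbo.StagingSales WHERE Processed = 0)
        BEGIN
            BEGIN TRANSACTION;

            -- Pick one batch of rows (UPDATE TOP (n) is available from SQL Server 2005)
            UPDATE TOP (@BatchSize) dbo.StagingSales
            SET Processed = 1
            WHERE Processed = 0;

            -- Move just that batch into the final table
            INSERT INTO dbo.Sales (SaleID, SaleDate, Amount)
            SELECT SaleID, SaleDate, Amount
            FROM dbo.StagingSales
            WHERE Processed = 1;

            UPDATE dbo.StagingSales
            SET Processed = 2
            WHERE Processed = 1;

            -- Committing here releases the locks before the next batch starts
            COMMIT TRANSACTION;
        END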
