Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Data Warehouse Update to Live Table


Data Warehouse Update to Live Table

Author
Message
jvanderberg
jvanderberg
SSC-Addicted
SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)

Group: General Forum Members
Points: 422 Visits: 746
We have a small data warehouse that updates hourly. I've been having some issues lately where the process that updates the table is intermittently stalling out and never finishing. I've narrowed it down to the step that deletes updated records, just before reinserting the new ones. Under normal circumstances, this part of the process takes less than a second.

As best as I can tell, the table is locked and not allowing the delete to occur. It's puzzling, however, that the lock never seems to resolve. Although the queries against this table normally take a few seconds, the update process will sit for hours without finishing. Eventually, it starts to slow down the entire database until users start getting time outs from select statements on unrelated databases. It basically cripples the system until I stop the job, then the system will immediately resume normal operation. I am certain that this job is causing the entire database to slow down -- the correlation is direct.

So I'm trying to come up with some way around this. Obviously, I can't have a job that randomly slows down the entire database.

If it helps, the exact query I'm running is this:

DELETE FROM fact.WorkOrder
WHERE WOStatus NOT IN ('C', 'X')
OR CreateDate >= @startdate
OR CloseDate >= @startdate

Which deletes all open work orders or work orders that have been touched in the last month. Immediately after this, the previously staged data is quickly pumped into the table.

What am I missing here? What are the preferred ways to update a actively used table? Any ideas why the lock isn't being released, even after an extended period of time?
jvanderberg
jvanderberg
SSC-Addicted
SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)

Group: General Forum Members
Points: 422 Visits: 746
This is a critical issue that remains unresolved, therefore I must bump.
ChrisM@Work
ChrisM@Work
SSCrazy Eights
SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)

Group: General Forum Members
Points: 8957 Visits: 19016
Have you looked at the Actual Execution Plan for the statement? It may also be worth examining the equivalent SELECT statement during a busy period. Run this, substituting your real PK, and post the actual plan;
SELECT 'YourPK', WOStatus, CreateDate, CloseDate 
FROM fact.WorkOrder
WHERE WOStatus NOT IN ('C', 'X')
OR CreateDate >= @startdate
OR CloseDate >= @startdate


I don't have a good explanation for your findings; however, checking the performance of the statement would be my first port of call. The quicker it runs, the less time locks will be held.

“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
jvanderberg
jvanderberg
SSC-Addicted
SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)

Group: General Forum Members
Points: 422 Visits: 746
Well, your suggestion made me quickly realize that I was missing two indexes. The field "CloseDate" was used often in queries, and was already indexed. The other two fields were not indexed.

Despite adding these two indexes, I am still having problems. I ran the SELECT query during peak hours, and it took 8 minutes. The execution plan was real simple. It spent 0% on the Select, 6% on Parallelism, and 94% on a clustered index scan. The clustered index is an artificial key (an identity field, because the true primary key is actually a 5-field complex key of all VARCHARs) that isn't used in the query at all. So it seems like this clustered index scan is the equivalent of a full table scan. The query selected 27,000 rows out of 55 million. Obviously a full table scan of 55 million rows to find 27,000 would be monstrously inefficient.

So it appears that I have some structural issues. I don't when the table reached 55 million rows, I don't remember it being that large. I think the table is growing faster than I expected it would, and I'm having scaling issues. I could use a few suggestions.

I don't think a composite index would help much. I think the greater than comparisons and the OR would get in the way of it using the index efficiently. (It's worth mentioning that the date fields are actually integers, packed in YYYYMMDD format.) Obviously it's decided not to use the individual indexes for some reason.

I'm not 100% sure where to go from there. I could possibly remove the CreateDate filter (which is principally there to prevent updating really old, open orders which are most likely erroneous). The principal things I'm looking for are open orders and orders closed since a specific date. Removing that part of the WHERE statement, however, had no noticeable impact on the query. It still continues to use the clustered index scan.

The clustered key is based on the WorkOrderID, which largely puts the rows in order by CreateDate. Since the clustered key is never directly used in queries, perhaps I could make it a non-clustered primary key and use CloseDate as the clustered key? It might speed up some queries, too, since CloseDate is used often in queries. I won't be able to do make this kind of change until after hours, however.

Thanks for the help and suggestion. Do you have any further ideas or thoughts?
Sean Pearce
Sean Pearce
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1170 Visits: 3432
Perhaps splitting it up might help

DELETE FROM fact.WorkOrder
WHERE WOStatus NOT IN ('C', 'X');

DELETE FROM fact.WorkOrder
WHERE CreateDate >= @startdate;

DELETE FROM fact.WorkOrder
WHERE CloseDate >= @startdate;





The SQL Guy @ blogspot

@SeanPearceSQL

About Me
jvanderberg
jvanderberg
SSC-Addicted
SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)

Group: General Forum Members
Points: 422 Visits: 746
Wow, talk about not seeing the forest for the trees. All three queries finished instantly, each using a index seek. I've altered the DELETE statements, and I expect it will have a marked impact on performance.

Thank you so much for the help. So glad I could get another perspective on this.
jvanderberg
jvanderberg
SSC-Addicted
SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)

Group: General Forum Members
Points: 422 Visits: 746
...Or not.

The SELECT statements finish in a fraction of a second, but the DELETE statements continue to stall out.
Sean Pearce
Sean Pearce
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1170 Visits: 3432
1. How many indexes on fact.WorkOrder?
2. How many Foreign Keys on fact.WorkOrder?



The SQL Guy @ blogspot

@SeanPearceSQL

About Me
jvanderberg
jvanderberg
SSC-Addicted
SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)SSC-Addicted (422 reputation)

Group: General Forum Members
Points: 422 Visits: 746
8 indexes including the primary key. The data space is 17.5G, and the index space is 7G.

There are no foreign keys.

Some additional information:

It stalled out while there was no other activity against that database. The activity monitor shows only sleeping sessions from my open SSMS. Nothing else was running against that particular database.

The wait type is listed as PAGEIOLATCH_EX and PAGEIOLATCH_SH. The wait begins immediately when I start running it.

It slows down the entire server as soon as I start running it. Things like the Activity Monitor and Job Activity Monitor take noticeably longer to refresh. Users get timeouts inside our ERP system.

It appears that the statement "DELETE FROM fact.WorkOrder WHERE WOStatus NOT IN ('C', 'X')" is completing immediately, but the statement "DELETE FROM fact.WorkOrder WHERE CreateDate >= @startdate" took 8 minutes to complete.
Sean Pearce
Sean Pearce
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1170 Visits: 3432
Can we see the actual execution plan for
DELETE FROM fact.WorkOrder WHERE CreateDate >= @startdate





The SQL Guy @ blogspot

@SeanPearceSQL

About Me
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search