I apologize in advance for this being vague, but it would take some time to include real code - there is a TON of it and some things would need to be masked out. So I am going to post this as a very general "throw out some ideas" kind of question.
We have a nightly process in our Great Plains system that basically takes a snapshot of the current historic aged trial balance and dumps the results into a table. It usually runs in under a minute, but recently we've had some failures due to the job hanging and running for hours - we typically end up killing it after 24 hours.
Here is the weird part... If I run the exact same code again, it often finishes again in under a minute. Which, okay... it could be cached or whatever; but if we tweak the date parameters, we can get it to alternate between running normally (under a minute) to hanging - we generally kill it after 10 minutes.
Every time it hangs, it's hanging on an update statement against temp tables created by the stored proc, with a wait type of CXCONSUMER.
Our first guess was some kind of bad data... but that doesn't seem to match because it should always fail. Our next guess was some other process clashing with it... but the thing is, I can replicate the problem on our reporting server, which is an hourly log-shipped copy of production. There are no processes on the reporting server, and the databases are read-only.
My guess - based on the CXCONSUMER - is that it's resources. But I can't pinpoint anything that would be causing the issue on two different boxes. Both servers are pretty robust in terms of CPU, RAM, etc.
The stored procedure has parameters to allow us to provide a range of customer names - the nightly job leaves these blank, which means do all. We can run "D" through "Z" without an issue. If we try "0" (zero) through "D" it hangs. But we can do "0" - "B", "C"-"D" and it works.
Again, I realize this is all very vague... sorry about that...