• The step above would also be the approach I take. As other users mentioned, make sure that if the job does stop, it has no impact on anything else in the system and can be re-run later without issues.

    As you said in your comment, it takes anywhere from 1 to 14 hours, which is a very big difference. Without knowing why the run time varies that much, I would first try to find the root cause of the long runs and see whether the job can be optimised.
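
One way to start on the root cause is to record how long each part of the job takes, so a 14-hour run can be compared with a 1-hour run. This is only a minimal sketch in Python: the `timed` helper and the step names are hypothetical and not taken from your job.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def timed(step_name: str, timings: Dict[str, float]) -> Iterator[None]:
    """Add the wall-clock seconds spent inside the block to timings[step_name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the step raises, so failed runs are measured too.
        elapsed = time.perf_counter() - start
        timings[step_name] = timings.get(step_name, 0.0) + elapsed
```

Wrapping each step, for example `with timed("load_data", timings): ...`, and logging `timings` at the end of every run should show which step accounts for the difference.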

    If it can't be optimised and you have no choice, hopefully your job is split into steps so you can restart from the step that failed instead of repeating the whole job from scratch.
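
If the job is not yet split into restartable steps, the idea can be sketched roughly as follows. This is an illustration under assumptions, not a description of your scheduler: `run_job`, `load_completed`, `save_completed` and the JSON checkpoint file are all hypothetical names, and each step is assumed to be safe to run again if it fails halfway.

```python
import json
import os
from typing import Callable, List, Tuple

# A step is a (name, action) pair; names must be unique within a job.
Step = Tuple[str, Callable[[], None]]


def load_completed(checkpoint_path: str) -> List[str]:
    """Return the names of steps recorded as finished, or [] on a first run."""
    if not os.path.exists(checkpoint_path):
        return []
    with open(checkpoint_path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def save_completed(checkpoint_path: str, completed: List[str]) -> None:
    """Write the checkpoint to a temp file, then swap it into place."""
    tmp_path = checkpoint_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(completed, fh)
    os.replace(tmp_path, checkpoint_path)


def run_job(steps: List[Step], checkpoint_path: str) -> List[str]:
    """Run steps in order, skipping any that a previous run already finished."""
    completed = load_completed(checkpoint_path)
    for name, action in steps:
        if name in completed:
            continue  # finished in an earlier run
        action()  # if this raises, the step is not marked as done
        completed.append(name)
        save_completed(checkpoint_path, completed)
    return completed
```

Calling `run_job(steps, "job_checkpoint.json")` again after a failure picks up at the step that failed. Once the whole job has succeeded, the checkpoint file would need to be deleted (or named per run) so the next scheduled run starts from the first step.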