job schedule doesn't work well

  • Hi, guys:

    I am re-posting my problem since it's not solved.

    The problem was: My job schedule does not work well. For example, I have a 2 steps job, step 1 run a DTS; step 2 run a script code. I schedule it to run once an hour. Most time it works fine. But sometime after stopping it never start again, sometimes it stuck at step 2 and never stop. In both cases there no error message generated. When I check the job status and history, both succeeded, in the second case, the job status is "Executing job Step 2".

    Any help is very much appreciated.

  • This was removed by the editor as SPAM

  • You've looked at the job logs, but have you checked the Windows Event Logs? Sometimes they give additional error information, especially if the problem is caused by the OS and not SQL Server.

    -SQLBill

  • Lixin,

    I've had a look at your previous post. Unfortunately I can't see exactly what the Vineftp SP does, but it may be prone to any number of ftp problems (ftp site temporarily unavailable, password expired, internet/LAN connection glitch, etc). In that case the job step may hang and never recover... thus you get the "Executing Job Step 2" message.

    As a preventative measure you may want to develop a more robust ftp method (maybe even using a freeware package) that gracefully exits with an error code/msg if a problem is encountered, rather than hanging there waiting for a response as vanilla ftp would.

    To assist diagnosis, put some PRINT (or RAISERROR ... NOWAIT) statements in the Vineftp SP, and use the Advanced tab of the Edit Job Step dialog to send the job step output to a file. This may or may not capture some helpful info.

    In the meantime, to automatically detect these situations you may want to schedule another script that runs say 10 minutes before the scheduled start. The script would have to be able to query the execution state of jobs, so look at SPs msdb..sp_get_composite_job_info and/or msdb..sp_help_job for clues on how to do this. Your script could even issue an sp_stop_job if the job step is still executing, but this may not have any effect if it's waiting on a host command. If you currently use EM "Stop Job" to recover from these situations then theoretically sp_stop_job will work.


    Cheers,
    - Mark

  • Thank you mccork,

    Could you give me an example script for detecting job execution? We do have the ftp site temporarily unavailable, network down problems sometimes, the problem is after them all recovered the job is still hanging there, even the SQL agent is running. My Merge Agents for replication database have the same problem. After success stop sometimes (at least from the job log it says success), it never start again and you never know when it would happen.

    I will try what you suggested.

    Thanks again.

    Lixin

    Edited by - lxwang on 06/05/2003 07:50:08 AM

  • Lixin,

    Try the following script (blatantly hijacked from MS code). Schedule it to run maybe 10 mins before each scheduled time of your problem job.

     
    
    SET NOCOUNT ON
    DECLARE @JobName SYSNAME
    SET @JobName = 'Insert your job name here'

    CREATE TABLE #xp_results (
    job_id UNIQUEIDENTIFIER NOT NULL,
    last_run_date INT NOT NULL,
    last_run_time INT NOT NULL,
    next_run_date INT NOT NULL,
    next_run_time INT NOT NULL,
    next_run_schedule_id INT NOT NULL,
    requested_to_run INT NOT NULL,
    request_source INT NOT NULL,
    request_source_id sysname COLLATE database_default NULL,
    running INT NOT NULL,
    current_step INT NOT NULL,
    current_retry_attempt INT NOT NULL,
    job_state INT NOT NULL)

    INSERT INTO #xp_results
    EXECUTE master.dbo.xp_sqlagent_enum_jobs 1, '', NULL

    IF (SELECT xpr.job_state FROM #xp_results xpr, msdb.dbo.sysjobs_view sjv
    WHERE sjv.job_id = xpr.job_id
    AND sjv.name = @JobName) = 1 -- active
    BEGIN
    RAISERROR('Job "%s" still active. Attempting to stop it....', 0, 1, @JobName) WITH LOG
    EXEC msdb..sp_stop_job @job_name=@JobName
    END
    ELSE
    PRINT 'Job "' + @JobName + '" is not executing. No action necessary'

    DROP TABLE #xp_results


    Cheers,
    - Mark

  • Hi, mccork,

    I got the script run on my server, but I can’t figure out how to check the result. Where is the Log file?

    Do I need to save it to somewhere (e.g. table or file)?

    Please show me.

    Thanks a lot.

  • The "RAISERROR ... WITH LOG" will write to the SQL Error Log. This is viewable through QA with sp_readerrorlog or through EM under "Management / SQL Server Logs / Current".

    Additionally, if you schedule this process, either the RAISERROR or the PRINT result will be viewable from the "View Job History / Show Step Details" dialog in EM.


    Cheers,
    - Mark

  • Thanks mccork,

    I do find the result from ¡°View Job History¡±. It says: Job "MyJob" is not executing. No action necessary [SQLSTATE 01000] (Message 0). The step succeeded. in one step. No ¡°Error Log¡± since the job was success.

    I scheduled it run at 11:50 and 12:50, ¡°MyJob¡± was run at 12:00. it suppose to run at 1:00 and 2:00 ¡­ , but it is still hanging there. When I check View Job History, it says success.

    Still no clue what was wrong.

    Any suggestion?

  • If your original (FTP) job is still executing, can it be stopped with a simple "stop job" through EM?


    Cheers,
    - Mark

  • Yes,but hard to restart. It says the job is executing by another schedule. So I had to stop the SQL agent and restart it.

  • The "executing by another schedule" message suggests that the STOP JOB instruction was not effective, which is what I feared because your host command (FTP activity) is hanging - most probably waiting on a response that cannot be supplied.

    In that case my earlier suggestion of a more robust ftp method would be your best course of action.


    Cheers,
    - Mark

  • Do you have any example for such robust script method?

    It is pretty strange. I started a job this morning at 6:00, it run another time at 7:00, then stopped and not start any more. I looked the job history, every step was success, no hanging executing steps, looks like the schedule doesn¡¯t work.

    We have 6 remote servers in different cities, only 1 has the problem frequently, no obvious difference of those server settings. Is there any server settings could affect the SQL agent?

    Thanks

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply