SQL Agent skipping job execution

  • I am experiencing an issue with a couple of SQL Agent jobs on SQL Server 2008 R2. The job will execute and you can see it in the history, but the run time is typically less than 10 seconds and the code itself did not execute. Actually, some of the code is executing, but not all of it. Strange as hell. And it is infrequent. Some days it works, some days it doesn't.

    Mr. Google shows me that other people have experienced this issue going back to SQL2005, however, nobody seems to have a solution.

    These are things it is not:

    - It is not a security issue. The job is running as system administrator

    - It is not due to other jobs running at the same time.

    - It is not due to SQL Agent or even SQL Server needing to be restarted or the server being rebooted (all of that has been done).

    - It is not specific to an edition of SQL. I'm seeing the problem in Enterprise and Standard.

    Here is one of the jobs that is showing this behavior:

    DECLARE @Threads INT = 5

    DECLARE @ReorgThreshold INT = 30

    DECLARE @Rows BIGINT = 10000

    DECLARE @ByPassGetStats BIT = 0

    DECLARE @ByPassDefrag BIT = 0

    DECLARE @DatabaseID INT;

    DECLARE @DatabaseName VARCHAR(255);

    DECLARE @TableName VARCHAR(255);

    DECLARE @SchemaName VARCHAR(255);

    DECLARE @ThreadCounter INT;

    DECLARE @SQL NVARCHAR(4000);

    CREATETABLE #IndexFrag (

    DatabaseName VARCHAR(255) NULL,

    ObjectID INT,

    IndexID INT,

    PartitionNumber INT,

    FragmentationPerc FLOAT,

    Pages INT,

    Records BIGINT,

    IndexName VARCHAR(255),

    SchemaName VARCHAR(255),

    TableName VARCHAR(255),

    AllowPageLocks INT);

    CREATE TABLE #TableList (

    DatabaseName VARCHAR(255) NULL,

    SchemaName VARCHAR(255) NULL,

    TableName VARCHAR(255) NULL,

    Records BIGINT );

    IF @ByPassGetStats = 0

    BEGIN

    --Get the index fragmentation

    DECLARE ReorgIndexCursor01 CURSOR FOR

    SELECT[name], database_id

    FROMmaster.sys.databases

    WHERE[name] NOT IN ('master', 'model', 'tempdb')

    ORDERBY [name];

    OPENReorgIndexCursor01;

    FETCHNEXT FROM ReorgIndexCursor01 INTO @DatabaseName, @DatabaseID;

    WHILE @@FETCH_STATUS = 0

    BEGIN

    INSERTINTO #IndexFrag (DatabaseName, ObjectID, IndexID, PartitionNumber, FragmentationPerc, Pages, Records)

    SELECT@DatabaseName,

    ps.OBJECT_ID,

    ps.index_id,

    ps.partition_number,

    ps.avg_fragmentation_in_percent,

    ps.page_count,

    ps.record_count

    FROMmaster.sys.dm_db_index_physical_stats (@DatabaseID, NULL, NULL , NULL, N'SAMPLED') ps

    WHEREps.index_id > 0

    OPTION (MaxDop 1);

    --Update the table with the schema, table, and index names

    SELECT@SQL = 'USE [' + @DatabaseName + ']

    UPDATE#IndexFrag

    SETIndexName = i.name,

    SchemaName = s.name,

    TableName = o.name,

    AllowPageLocks = i.allow_page_locks

    FROM#IndexFrag ti

    INNER JOIN sys.objects o ON ti.ObjectID = o.object_id

    INNER JOIN sys.schemas s ON o.schema_id = s.schema_id

    INNER JOIN sys.indexes i ON o.object_id = i.object_id

    WHEREti.DatabaseName = ' + CHAR(39) + @DatabaseName + CHAR(39) + '

    AND i.index_id = ti.IndexID ';

    EXEC (@SQL);

    FETCHNEXT FROM ReorgIndexCursor01 INTO @DatabaseName, @DatabaseID;

    END

    CLOSE ReorgIndexCursor01;

    DEALLOCATE ReorgIndexCursor01;

    --Update the PNGCORP_IndexList table

    UPDATEmsdb.dbo.PNGCORP_IndexList

    SETFragmentationPerc = f.FragmentationPerc,

    Pages = f.Pages,

    Records = f.Records,

    LastChecked = GETDATE()

    FROM#IndexFrag f

    INNER JOIN msdb.dbo.PNGCORP_IndexList il ON il.DatabaseName = f.DatabaseName

    AND il.ObjectID = f.ObjectID

    AND il.IndexID = f.IndexID

    AND il.PartitionNumber = f.PartitionNumber;

    --Insert new indexes into the PNGCORP_IndexList

    INSERTINTO msdb.dbo.PNGCORP_IndexList (DatabaseName, ObjectID, IndexID, PartitionNumber, FragmentationPerc, Pages, Records, IndexName,

    SchemaName, TableName, AllowPageLocks, LastChecked)

    SELECTDatabaseName, ObjectID, IndexID, PartitionNumber, FragmentationPerc, Pages, Records, IndexName,

    SchemaName, TableName, AllowPageLocks, GETDATE()

    FROM#IndexFrag f

    WHERE(SELECTCOUNT(*)

    FROMmsdb.dbo.PNGCORP_IndexList il

    WHEREil.DatabaseName = f.DatabaseName

    AND il.ObjectID = f.ObjectID

    AND il.IndexID = f.IndexID

    AND il.PartitionNumber = f.PartitionNumber) = 0;

    END

    --Get the tables we need to reindex

    INSERTINTO #TableList (DatabaseName, SchemaName, TableName, Records)

    SELECTDatabaseName, SchemaName, TableName, MAX(Records)

    FROMmsdb.dbo.PNGCORP_IndexList

    WHEREFragmentationPerc >= @ReorgThreshold

    AND Records >= @Rows

    GROUPBY DatabaseName, SchemaName, TableName

    ORDERBY MAX(Records) DESC

    --Cycle through the problem indexes and insert them into the PNGCORP_IndexReorg# tables.

    SET@ThreadCounter = 1;

    IF @Threads > 5 SET @Threads = 5;

    TRUNCATE TABLE msdb.dbo.PNGCORP_IndexReorg1;

    TRUNCATE TABLE msdb.dbo.PNGCORP_IndexReorg2;

    TRUNCATE TABLE msdb.dbo.PNGCORP_IndexReorg3;

    TRUNCATE TABLE msdb.dbo.PNGCORP_IndexReorg4;

    TRUNCATE TABLE msdb.dbo.PNGCORP_IndexReorg5;

    DECLARE ReorgIndexCursor02 CURSOR FOR

    SELECTDatabaseName, SchemaName, TableName

    FROM#TableList

    ORDERBY Records DESC;

    OPENReorgIndexCursor02;

    FETCHNEXT FROM ReorgIndexCursor02 INTO @DatabaseName, @SchemaName, @TableName;

    WHILE @@FETCH_STATUS = 0

    BEGIN

    SET @SQL = '

    INSERTINTO msdb.dbo.PNGCORP_IndexReorg' + CAST(@ThreadCounter AS VARCHAR(1)) + ' (DatabaseName, SchemaName, TableName, IndexName, AllowPageLocks)

    SELECTDISTINCT i.DatabaseName, i.SchemaName, i.TableName, i.IndexName, i.AllowPageLocks

    FROMmsdb.dbo.PNGCORP_IndexList i

    INNER JOIN #TableList t ON t.DatabaseName = i.DatabaseName

    AND t.SchemaName = i.SchemaName

    AND t.TableName = i.TableName

    WHEREi.DatabaseName = ''' + @DatabaseName + '''

    AND i.SchemaName = ''' + @SchemaName + '''

    AND i.TableName = ''' + @TableName + '''

    AND i.FragmentationPerc >= ' + CAST(@ReorgThreshold AS VARCHAR(25)) + '

    AND i.Records >= ' + CAST(@Rows AS VARCHAR(25)) + '; ';

    EXEC (@SQL);

    SET @ThreadCounter = @ThreadCounter + 1;

    IF @ThreadCounter > @Threads SET @ThreadCounter = 1;

    FETCHNEXT FROM ReorgIndexCursor02 INTO @DatabaseName, @SchemaName, @TableName;

    END

    CLOSE ReorgIndexCursor02;

    DEALLOCATE ReorgIndexCursor02;

    DROP TABLE #TableList;

    DROP TABLE #IndexFrag;

    --Start the index jobs

    IF @ByPassDefrag = 0

    BEGIN

    EXEC msdb.dbo.sp_start_job @Job_Name = 'Database Maintenance.IndexReorg1';

    IF @Threads >= 2 EXEC msdb.dbo.sp_start_job @Job_Name = 'Database Maintenance.IndexReorg2';

    IF @Threads >= 3 EXEC msdb.dbo.sp_start_job @Job_Name = 'Database Maintenance.IndexReorg3';

    IF @Threads >= 4 EXEC msdb.dbo.sp_start_job @Job_Name = 'Database Maintenance.IndexReorg4';

    IF @Threads = 5 EXEC msdb.dbo.sp_start_job @Job_Name = 'Database Maintenance.IndexReorg5';

    END

    What we are doing here is finding all of the indexes with a fragmentation > 30% and putting them into 5 different tables. Then we kick off five other jobs to perform the index maintenance.

    On some days, this jobs executes perfectly. It normally takes about an hour to run (there are some large databases on this server). On other days, it takes less than 10 seconds to execute, the code that updates the index list isn't executed. But what is really strange is the last couple of lines of code are ALWAYS executed and the five "IndexReorg" jobs are kicked off.

    I have the same issue with a db check job. Some days it executes, some days it doesn't. SQL Agent always reports that the job executed successfully, but you could tell by the runtime that nothing was done.

    SET NOCOUNT ON

    DECLARE @SQL NVARCHAR(MAX)

    DECLARE @DatabaseName VARCHAR(255)

    DECLARE CheckDatabaseIntegrityCursor CURSOR FOR

    SELECT[name]

    FROMmaster.sys.databases

    WHERE[name] NOT IN ('model', 'tempdb')

    ORDERBY [name]

    OPENCheckDatabaseIntegrityCursor

    FETCHNEXT FROM CheckDatabaseIntegrityCursor INTO @DatabaseName

    WHILE @@FETCH_STATUS = 0

    BEGIN

    PRINT @DatabaseName + ' ---------------------------------------------------------------'

    SELECT@SQL ='USE [' + @DatabaseName + ']

    DBCC CHECKDB(N' + CHAR(39) + @DatabaseName + CHAR(39) + ')

    '

    PRINT @SQL

    EXEC master.dbo.sp_executesql @SQL

    FETCHNEXT FROM CheckDatabaseIntegrityCursor INTO @DatabaseName

    END

    CLOSE CheckDatabaseIntegrityCursor

    DEALLOCATE CheckDatabaseIntegrityCursor

    I've seen a few posts on various forums where folks have reporting such issues, but nobody has any solution.

    Anyone else see anything like this?

    ...

  • Have you considered setting up a history table for the reorg/reindex process so you can review those stats on a daily basis to see if any of the indexes actually met the criteria for reorg/rebuild?

  • Yes - that is actually what I have (msdb.dbo.PNGCORP_IndexList) . That's how I can tell that the job isn't working on some days. The table has a LastChecked and LastReorg date/time column. I view the report daily and can see the LastChecked isn't always updated.

    ...

  • Just trying to find something that fits the symptoms...

    Is it possible that a db was deleted (offlined???) during your processing?! Perhaps a process that deletes and then restores a db(s)?!

    Specifically, when you check @@FETCH_STATUS, you only check for "= 0", meaning that a "-2" would also immediately end all processing for that cursor.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • That would make sense, but no, none of the databases were deleted.

    ...

  • Not what I meant by history. I meant a history of the fragmentation on the indexes. My first thought is that nothing happened on certain days because nothing met the criteria to be processed. If all the indexes are less than 30% fragmented, then nothing happens.

  • On the second job, do you see the database name print output in the job output (file or step history)?

  • The "LastChecked" column should be updated regardless of the fragmentation percent. The procedure gets the frag percent for all indexes, updates the _IndexList table (with the LastChecked column), then only looks for those indexes that are fragmented more than 30% from the _IndexList table.

    The table was never getting updated.

    Never looked at the job output - wasn't writing anything to file.

    What I did was create a master server to run the job from. Seems to be running great now. I have no idea why - the job is still physically created and executed from the local server. Maybe it was just a matter of deleting and recreating the job. Don't know. The strange thing is that it was occurring on multiple servers on different days.

    ...

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply