Manage SQL Agent Job History

October 2, 2015

Recently in one of our server row count in MSDB.DBO.SysJobHistory table exceeded 100 Million and looking at job history was slow via SSMS. It was impossible via the GUI. Because we had about 50 jobs that were running frequently (called from a listener mostly once in every 2 to 3 minutes). I was curious if we can manage the job history for individual jobs or job categories. I also wanted different retention for success vs failure of the job. Meaning for the jobs that run every minute a few days of history for success run is enough whereas for failed ones I prefer to keep it longer.

We all know about the problem with the mechanism for job history purge built into the product and there are many data professionals blogged about it. Many people did complex calculations about how many rows will be generated based on the job running frequency. Then strike a balance between frequent and nonfrequent running jobs.

The approach I am taking is to disable the built-in feature completely and roll out a custom solution for purging the data. Where I will be able to set retention per job, per job category (the same retention for all jobs belong to a job category). Moreover as mentioned above this will enable us to set different retention for successful vs failed jobs.

AgentJobHistory

First, we look at how much data we have in our msdb.dbo.sysjobhistory table.

USE msdb;
GO

--Count of history record by job (only successful ones)
SELECT  sj.name ,
        COUNT(*) AS [count]
FROM    msdb.dbo.sysjobhistory sjh WITH ( NOLOCK )
        JOIN msdb.dbo.sysjobs sj WITH ( NOLOCK ) ON sjh.job_id = sj.job_id
WHERE   run_status IN ( 1, 2, 3 )
GROUP BY sj.name
ORDER BY 2 DESC;
GO

--Count of history record by job (only failed ones)
SELECT  sj.name ,
        COUNT(*) AS [count]
FROM    msdb.dbo.sysjobhistory sjh WITH ( NOLOCK )
        JOIN msdb.dbo.sysjobs sj WITH ( NOLOCK ) ON sjh.job_id = sj.job_id
WHERE   run_status IN ( 0 )
GROUP BY sj.name
ORDER BY 2 DESC;
GO

--Find oldest date per job (only successful ones)
SELECT  sj.name ,
        MIN(sjh.run_date) AS [Oldest_Successful_Record]
FROM    msdb.dbo.sysjobhistory sjh WITH ( NOLOCK )
        JOIN msdb.dbo.sysjobs sj WITH ( NOLOCK ) ON sjh.job_id = sj.job_id
WHERE   run_status IN ( 1, 2, 3 )
GROUP BY sj.name
ORDER BY 2 DESC;
GO

--Find oldest date per job (only failed ones)
SELECT  sj.name ,
        MIN(sjh.run_date) AS [Oldest_Failed_Record]
FROM    msdb.dbo.sysjobhistory sjh WITH ( NOLOCK )
        JOIN msdb.dbo.sysjobs sj WITH ( NOLOCK ) ON sjh.job_id = sj.job_id
WHERE   run_status IN ( 0 )
GROUP BY sj.name
ORDER BY 2 DESC;
GO

Create two views in your DBA database. One for Jobs and another one for job history.

SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO
IF OBJECT_ID('dbo.v_sysjobs', 'V') IS NOT NULL
    DROP VIEW dbo.v_sysjobs;
GO
SET ANSI_NULLS ON;
GO

SET QUOTED_IDENTIFIER ON;
GO
CREATE VIEW [dbo].[v_sysjobs]
AS
    SELECT  CASE c.category_id
              WHEN 0 THEN NULL
              ELSE c.name
            END AS CategoryName ,
            j.job_id AS job_id ,
            j.name AS JobName
    FROM    msdb..sysjobs j
            JOIN msdb..syscategories c ON j.category_id = c.category_id;

GO

IF OBJECT_ID('dbo.v_sysjobhistory', 'V') IS NOT NULL
    DROP VIEW dbo.v_sysjobhistory;
GO

SET ANSI_NULLS ON;
GO

SET QUOTED_IDENTIFIER ON;
GO

CREATE VIEW [dbo].[v_sysjobhistory]
AS
    SELECT  instance_id ,
            job_id ,
            run_date ,
            run_time ,
            run_status
    FROM    msdb..sysjobhistory;

GO

Create a table where you will enter your retention requirements. You can make entries per job, per job category and different values for success and failed jobs.

CREATE TABLE [dbo].[DBA_sysjobhistory_config]
    (
      [JobName] [VARCHAR](256) NULL ,
      [CategoryName] [VARCHAR](256) NULL ,
      [NumDaysToRetain] [INT] NULL ,
      [NumFailureDaysToRetain] [INT] NULL
    )
ON  [PRIMARY];

Insert your default retention here. For example all successful job history for 90 days and failed ones for a year.

INSERT  INTO DBA_sysjobhistory_config
VALUES  ( '(default)', '(default)', 90, 365 );
GO

Store procedure that will do the real work of purging msdb.dbo.sysjobhistory table. It will be called by the driver proc.

IF OBJECT_ID('dbo.DBA_sp_jobhistory_row_limiter', 'P') IS NOT NULL
    DROP PROCEDURE dbo.DBA_sp_jobhistory_row_limiter;
GO
SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO

CREATE PROCEDURE [dbo].[DBA_sp_jobhistory_row_limiter]
    @job_id UNIQUEIDENTIFIER
AS
    BEGIN
        SET DEADLOCK_PRIORITY LOW;
        DECLARE @NumDaysToRetain INT;
        DECLARE @NumFailureDaysToRetain INT;
        DECLARE @LastDayToRetain INT;
        DECLARE @LastFailureDayToRetain INT;
        DECLARE @RowsDeleted INT;

        SET NOCOUNT ON;  

 -- Get the NumDaysToRetain and NumFailureDaysToRetain
 -- FailSafe: maintain 30 days.
        SELECT
 --j.* , c_byJob.*, c_ByCat.* , c_Default.*
                @NumDaysToRetain = COALESCE(c_byJob.NumDaysToRetain,
                                            c_byCat.NumDaysToRetain,
                                            c_Default.NumDaysToRetain, 30) ,
                @NumFailureDaysToRetain = COALESCE(c_byJob.NumFailureDaysToRetain,
                                                   c_byCat.NumFailureDaysToRetain,
                                                   c_Default.NumFailureDaysToRetain,
                                                   30)
        FROM    v_sysjobs j
                LEFT JOIN DBA_sysjobhistory_config c_byJob --first we prefer to JOIN first by JobName
                ON j.JobName = c_byJob.JobName
                LEFT JOIN DBA_sysjobhistory_config c_byCat --next we prefer to JOIN first by Category
                ON j.CategoryName = c_byCat.CategoryName
                CROSS JOIN DBA_sysjobhistory_config c_Default   --last, get me the defaults.
        WHERE   j.job_id = @job_id
                AND c_Default.JobName = '(default)'
                AND c_Default.CategoryName = '(default)';

 --get the actual date of the most recent rows that we wish to maintain.
 --convert that to an INT so it works with msdb..sysjobhistory
        SELECT  @LastDayToRetain = CONVERT(INT, CONVERT(VARCHAR(200), ( GETDATE()
                                                              - @NumDaysToRetain ), 112));
        SELECT  @LastFailureDayToRetain = CONVERT(INT, CONVERT(VARCHAR(200), ( GETDATE()
                                                              - @NumFailureDaysToRetain ), 112));

 --DELETE sysjobhistory rows in a loop, keep going until we have nothing left to delete
        SELECT  @RowsDeleted = 1;
        WHILE ( @RowsDeleted <> 0 )
            BEGIN
 --handle SUCCESS case
                BEGIN TRAN;
                DELETE TOP ( 1000 )
                FROM    v_sysjobhistory WITH ( READPAST )
                WHERE   job_id = @job_id
                        AND run_status IN ( 1, 2, 3 )   --Succeeded, Retry, and Canceled.  I view all of these as successes.
                        AND run_date < @LastDayToRetain;

                SELECT  @RowsDeleted = @@ROWCOUNT;
 --PRINT 'SUCCESS rows deleted:  ' + convert(varchar(200),@RowsDeleted)
                COMMIT;

 --handle FAILURE case
                BEGIN TRAN;
                DELETE TOP ( 1000 )
                FROM    v_sysjobhistory WITH ( READPAST )
                WHERE   job_id = @job_id
                        AND run_status IN ( 0 )   --Failure
                        AND run_date < @LastFailureDayToRetain;
                COMMIT;

                SELECT  @RowsDeleted = CASE WHEN @@ROWCOUNT = 0
                                            THEN @RowsDeleted
                                            ELSE @@ROWCOUNT
                                       END;
 --PRINT 'FAILURE rows deleted (or successes from above):  ' + convert(varchar(200),@RowsDeleted)

 --catch your breath
                WAITFOR DELAY '00:00:01';
            END;

        RETURN(0); -- Success
    END;
GO

Driver store procedure which will go through your entries in DBA_sysjobhistory_config table and manage the purge.

IF OBJECT_ID('dbo.DBA_sysjobhistory_driver', 'P') IS NOT NULL
    DROP PROCEDURE dbo.DBA_sysjobhistory_driver;
GO

SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO

CREATE PROCEDURE [dbo].[DBA_sysjobhistory_driver]
AS
    BEGIN
        SET DEADLOCK_PRIORITY LOW;
        DECLARE @job_id UNIQUEIDENTIFIER;

        SET NOCOUNT ON;  

--first, blindly set SQLAgent sysjobhistory to unlimited.  This shuts off sp_jobhistory_row_limiter
--which cuts down on waits and job blocking for larger customers.
        EXEC msdb.dbo.sp_set_sqlagent_properties @jobhistory_max_rows = -1,
            @jobhistory_max_rows_per_job = -1;

        DECLARE CURSER CURSOR
        FOR
            SELECT  job_id
            FROM    v_sysjobs;
        OPEN CURSER;
        FETCH NEXT FROM CURSER INTO @job_id;
        WHILE ( @@FETCH_STATUS = 0 )
            BEGIN

 --we must have at least 100 history rows for the job, otherwise do no purging.  This is so we don't lose history for jobs that are run infrequently (ie daylight savings)
                IF ( SELECT COUNT(*)
                     FROM   v_sysjobhistory WITH ( READPAST )
                     WHERE  job_id = @job_id
                   ) > 100
                    BEGIN
 --PRINT 'Running DBA_sp_jobhistory_row_limiter for @job_id:  ' + convert(varchar(200),@job_id)
                        EXEC dbo.DBA_sp_jobhistory_row_limiter @job_id = @job_id;
                    END;

                FETCH NEXT FROM CURSER INTO @job_id; 

            END;
        CLOSE CURSER;
        DEALLOCATE CURSER;

    END;

Create a SQL agent job that will call the driver store procedure based on the schedule you set up. If you run my code as is it will run every 6 hours. You can adjust your schedule as needed.

SE [msdb]
GO

DECLARE @jobId binary(16)

SELECT @jobId = job_id FROM msdb.dbo.sysjobs WHERE (name = N'PurgeJobHistoryTableMSDB')
IF (@jobId IS NOT NULL)
BEGIN
    EXEC msdb.dbo.sp_delete_job @jobId,@delete_unused_schedule=1
END

GO

BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0

IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode =  msdb.dbo.sp_add_job @job_name=N'PurgeJobHistoryTableMSDB',
 @enabled=1,
 @notify_level_eventlog=0,
 @notify_level_email=0,
 @notify_level_netsend=0,
 @notify_level_page=0,
 @delete_level=0,
 @description=N'No description available.',
 @category_name=N'[Uncategorized (Local)]',
 @owner_login_name=N'sa', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Run Driver Store Procedure',
 @step_id=1,
 @cmdexec_success_code=0,
 @on_success_action=1,
 @on_success_step_id=0,
 @on_fail_action=2,
 @on_fail_step_id=0,
 @retry_attempts=0,
 @retry_interval=0,
 @os_run_priority=0, @subsystem=N'TSQL',
 @command=N'EXEC dbo.DBA_sysjobhistory_driver',
 @database_name=N'test',
 @flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'Every 6 hour',
 @enabled=1,
 @freq_type=4,
 @freq_interval=1,
 @freq_subday_type=8,
 @freq_subday_interval=6,
 @freq_relative_interval=0,
 @freq_recurrence_factor=0,
 @active_start_date=20150908,
 @active_end_date=99991231,
 @active_start_time=0,
 @active_end_time=235959
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
    IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:

Full code for you to download: PurgeHistory

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.