sql-server, try-catch, sql-server-agent

TRY/CATCH does not work on SQL Server Agent error?


I use sp_start_job to start a job.

The job (test2) has only one step:

select getdate()
waitfor delay '00:00:10'

The TRY/CATCH code:

begin try
    EXEC msdb.dbo.sp_start_job @job_name = 'test2'
end try
begin catch
    print 'error'
end catch

First run of the code:

Job 'test2' started successfully.

Second run of the code (within 10 seconds):

Msg 22022, Level 16, State 1, Line 0
SQLServerAgent Error: Request to run job test2 (from User sa) refused because the job is already running from a request by User sa.

Why does TRY/CATCH not work in this scenario?

UPDATE: I should have mentioned at the outset that I am working on a SQL Server 2005 instance that has linked servers (SQL Server 2000). I was trying to write a procedure on the SQL Server 2005 instance that checks a job on each linked server and starts it if it is not running. Initially I used TRY/CATCH, hoping to catch the error raised when starting a job that is already running, but that failed (hence this thread).

I finally used the following code (it won't compile as is; you need to substitute some variables, but it gives the idea):

    CREATE TABLE [dbo].[#jobInfo](
        [job_id] [uniqueidentifier] NULL,
        [originating_server] [nvarchar](30) ,
        [name] [nvarchar](128) ,
        [enabled] [tinyint] NULL,
        [description] [nvarchar](512) ,
        [start_step_id] [int] NULL,
        [category] [nvarchar](128) ,
        [owner] [nvarchar](128) ,
        [notify_level_eventlog] [int] NULL,
        [notify_level_email] [int] NULL,
        [notify_level_netsend] [int] NULL,
        [notify_level_page] [int] NULL,
        [notify_email_operator] [nvarchar](128) ,
        [notify_netsend_operator] [nvarchar](128) ,
        [notify_page_operator] [nvarchar](128) ,
        [delete_level] [int] NULL,
        [date_created] [datetime] NULL,
        [date_modified] [datetime] NULL,
        [version_number] [int] NULL,
        [last_run_date] [int] NOT NULL,
        [last_run_time] [int] NOT NULL,
        [last_run_outcome] [int] NOT NULL,
        [next_run_date] [int] NOT NULL,
        [next_run_time] [int] NOT NULL,
        [next_run_schedule_id] [int] NOT NULL,
        [current_execution_status] [int] NOT NULL,
        [current_execution_step] [nvarchar](128) ,
        [current_retry_attempt] [int] NOT NULL,
        [has_step] [int] NULL,
        [has_schedule] [int] NULL,
        [has_target] [int] NULL,
        [type] [int] NOT NULL
    )


    SET @sql = 
    'INSERT INTO #jobInfo
    SELECT * FROM OPENQUERY( [' + @srvName + '],''set fmtonly off exec msdb.dbo.sp_help_job'')'

    EXEC(@sql)

    IF EXISTS (select * from #jobInfo WHERE [name] = @jobName AND current_execution_status IN (4,5)) -- 4: idle, 5: suspended 
    BEGIN
        SET @sql = 'EXEC [' + @srvName + '].msdb.dbo.sp_start_job @job_name = ''' + @jobName + ''''
        --print @sql    
        EXEC (@sql) 
        INSERT INTO #result (srvName ,status ) VALUES (@srvName, 'Job started.')
    END ELSE BEGIN
        INSERT INTO #result (srvName ,status ) VALUES (@srvName, 'Job is running already. No action taken.')
    END
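
For readers who want to try the fragment above, the missing pieces might look something like the following sketch. The server name, the job name, and the column sizes of #result are hypothetical placeholders, not values from the original procedure; the variables are assigned with SET because SQL Server 2005 does not allow initializers in DECLARE.

```sql
-- Hypothetical preamble: run this before the fragment above.
DECLARE @sql     NVARCHAR(4000);
DECLARE @srvName SYSNAME;
DECLARE @jobName SYSNAME;

SET @srvName = N'LINKEDSRV1';  -- placeholder linked server name
SET @jobName = N'test2';       -- job to check on that server

-- One status row per linked server (column sizes are assumptions).
CREATE TABLE #result
(
    srvName SYSNAME       NOT NULL,
    status  NVARCHAR(100) NOT NULL
);
```

In the real procedure, @srvName would presumably be set inside a loop over the linked servers, and the four-part call to sp_start_job needs RPC Out enabled on each linked server.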

Solution

  • Not all errors can be caught by TRY/CATCH. In this case, sp_start_job actually calls external procedures, and these are outside the bounds of SQL Server's error handling. Or at least that's the story that they're sticking to:

    http://connect.microsoft.com/SQLServer/feedback/details/362112/sp-start-job-error-handling

    Also note that this is still a problem in SQL Server 2012 SP1 CU3. Please vote and comment if you want this bug fixed.

    A tedious but viable workaround, which requires certain permissions and in this case assumes the job owner is sa:

    DECLARE @x TABLE
    (
      a VARBINARY(32),b INT,c INT,d INT,e INT,f INT,g INT,h INT,i NVARCHAR(64),
      Running BIT, -- the only important column
      k INT,l INT,m INT
    );
    
    DECLARE @job_id UNIQUEIDENTIFIER;
    
    SELECT @job_id = job_id FROM msdb.dbo.sysjobs WHERE name = N'test2';
    
    INSERT @x EXEC master.dbo.xp_sqlagent_enum_jobs 1, N'sa', @job_id;
    
    IF EXISTS (SELECT 1 FROM @x WHERE Running = 0)
    BEGIN
         EXEC msdb.dbo.sp_start_job @job_name = N'test2';
    END
    ELSE
    BEGIN
         PRINT 'error';
    END
    

    Even better might be:

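    DECLARE @job_id UNIQUEIDENTIFIER;
    
    SELECT @job_id = job_id FROM msdb.dbo.sysjobs WHERE name = N'test2';
    
    -- sysjobactivity keeps one row per job per Agent session, so look only
    -- at the latest session. A job is running if it has started but not
    -- yet stopped; a job that has never run has both dates NULL.
    IF NOT EXISTS
    (
      SELECT 1 FROM msdb.dbo.sysjobactivity
        WHERE job_id = @job_id
          AND session_id = (SELECT MAX(session_id) FROM msdb.dbo.syssessions)
          AND start_execution_date IS NOT NULL
          AND stop_execution_date IS NULL
    )
    BEGIN
         EXEC msdb.dbo.sp_start_job @job_name = N'test2';
    END
    ELSE
    BEGIN
         PRINT 'error';
    END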
    

    In either case, it is still possible that the job has started between the check for its status and the call to start it, so this doesn't eliminate errors from sp_start_job altogether, but it makes them far less likely to occur.
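
For the remaining race window, one option is to capture the return code of sp_start_job, which is documented as 0 (success) or 1 (failure). The sketch below assumes that a refused request, such as the "already running" case, comes back as a nonzero code; that is worth verifying on your own build. It does not suppress message 22022, which is still sent to the client; it only lets the calling code react to the failure.

```sql
DECLARE @rc INT;

-- Documented return codes: 0 (success) or 1 (failure).
EXEC @rc = msdb.dbo.sp_start_job @job_name = N'test2';

IF @rc <> 0
BEGIN
    -- Assumption: a request refused by SQL Server Agent lands here.
    PRINT 'sp_start_job did not start the job (return code '
        + CAST(@rc AS VARCHAR(11)) + ').';
END
```

Combined with either status check above, this covers the case where the job starts between the check and the call.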