laravel, amazon-sqs, laravel-queue

Laravel SQS FIFO queue: jobs process fine but stay in flight and fail


To limit the number of jobs coming from my queue, I added some code to my PHP job file. After the job is pushed, it sleeps for a random interval:

// random number of seconds, between 3 and 4 minutes
$r = rand(180, 240);
sleep($r);

The queue I am using is an SQS FIFO queue, and submitted jobs arrive there just fine. My worker runs as a single process and allows 3 tries per job:

more aws-worker.conf
command=php /var/www/html/website/artisan queue:work sqs_aws --sleep=5 --tries=3
autostart=true
autorestart=true
user=root
numprocs=1

However, when I submit 2 jobs, the queue worker releases each of them after approximately 1 minute but doesn't delete them from SQS. So they remain in flight, and after 3 attempts they are marked as failed:

[2018-12-23 13:21:54] Processing: App\Jobs\DispatchAwsGatewayJob
[2018-12-23 13:22:56] Processing: App\Jobs\DispatchAwsGatewayJob

[2018-12-23 13:27:55] Processing: App\Jobs\DispatchAwsGatewayJob
[2018-12-23 13:29:01] Processing: App\Jobs\DispatchAwsGatewayJob

[2018-12-23 13:34:00] Processing: App\Jobs\DispatchAwsGatewayJob
[2018-12-23 13:35:06] Processing: App\Jobs\DispatchAwsGatewayJob

[2018-12-23 13:40:05] Processing: App\Jobs\DispatchAwsGatewayJob
[2018-12-23 13:40:05] Failed:     App\Jobs\DispatchAwsGatewayJob

[2018-12-23 13:41:10] Processing: App\Jobs\DispatchAwsGatewayJob
[2018-12-23 13:41:10] Failed:     App\Jobs\DispatchAwsGatewayJob

Some other queue details:

Default Visibility Timeout: 30 seconds
Message Retention Period:   4 days
Receive Message Wait Time:  0 seconds

Is the sleep code perhaps interfering with the FIFO queue? I don't have any other way to limit the jobs on the queue.


Solution

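  • Jobs have a default timeout of 60 seconds. Once the timeout is reached, the worker process is killed and the job returns to the queue, unless the maximum number of attempts has been reached, in which case the job is marked as failed. Your sleep of 180 to 240 seconds always exceeds that limit, which matches your log: each attempt ends about a minute after it starts.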

    You can increase the timeout for jobs either at the queue worker level, or at the job level.

    If you want to increase the timeout for the queue worker, specify the timeout on the command line:

    php artisan queue:work --timeout=300
    

    If you want to increase the timeout for a specific job, set the $timeout property on the job class:

    public $timeout = 300;
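
As a rough sketch of where that property sits, here is what the job class from the question might look like. The class body is an assumption (the original job code is not shown), and the traits are the ones a standard Laravel job stub uses:

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class DispatchAwsGatewayJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    // Seconds this job may run before the worker kills it.
    // Must exceed the longest sleep (240 s) plus the real work.
    public $timeout = 300;

    // Same limit as --tries=3 on the worker command line.
    public $tries = 3;

    public function handle()
    {
        // ... the actual gateway dispatch would go here (not shown in the question) ...

        // Throttle: sleep a random number of seconds, between 3 and 4 minutes.
        sleep(rand(180, 240));
    }
}
```

A value set on the job class takes precedence over the --timeout option given to the worker.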
    

    The next problem is the Default Visibility Timeout. Since it is set to only 30 seconds, the message becomes visible on the queue again while the job is still running, and another worker could pick it up and run it a second time. If you only ever have one worker, this isn't much of an issue. But you'll want to fix it now so you don't run into problems later when you decide to add more workers.

    The Default Visibility Timeout should be higher than the maximum time you expect any of your jobs to run. So, if the $timeout is 300, the Default Visibility Timeout should be greater than 300 seconds.
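
The two rules above can be written down as a small sanity check. This is only an illustrative sketch: checkQueueTimeouts() is a hypothetical helper, not part of Laravel or the AWS SDK, and all values are in seconds.

```php
<?php

/**
 * Hypothetical helper: returns a list of problems with a set of timeouts.
 *
 * $maxJobSeconds     longest time a job is expected to run
 * $jobTimeout        Laravel job/worker timeout (--timeout or $timeout)
 * $visibilityTimeout SQS Default Visibility Timeout
 */
function checkQueueTimeouts(int $maxJobSeconds, int $jobTimeout, int $visibilityTimeout): array
{
    $problems = [];

    // The worker kills the job if it runs longer than the job timeout.
    if ($jobTimeout <= $maxJobSeconds) {
        $problems[] = "job timeout ({$jobTimeout}s) must exceed the longest job run ({$maxJobSeconds}s)";
    }

    // SQS makes the message visible again once the visibility timeout passes.
    if ($visibilityTimeout <= $jobTimeout) {
        $problems[] = "visibility timeout ({$visibilityTimeout}s) must exceed the job timeout ({$jobTimeout}s)";
    }

    return $problems;
}

// Settings from the question: up to 240 s of sleep, 60 s default timeout,
// 30 s visibility timeout. Both rules are violated.
print_r(checkQueueTimeouts(240, 60, 30));

// Suggested settings: 300 s job timeout, and an example visibility timeout
// of 360 s (any value above 300 works). No problems are reported.
print_r(checkQueueTimeouts(240, 300, 360));
```

The 360-second value is just an example; the only requirement stated above is that the visibility timeout be greater than the job timeout.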