
Celery worker exited prematurely signal 11: trying to run a python script on button click from Django view


I am working on a Django app, part of which transcribes audio with timestamps. When a user clicks a button in the web interface, the Django server launches a Python script that does the transcription.

Here is what I have tried so far. I have a separate transcribe.py file. When a user clicks the transcribe button on the web page, the request hits a view in the project app, which runs the script. However, after the script has partially run, the Django server terminates in the terminal.

Since the Python script is a long-running process, I figured I should run it in the background so the Django server doesn't terminate, so I set up Celery with Redis. The transcribe.py script runs perfectly well when I call it from the Django shell (started with the command below), but it terminates once again when I trigger it from the view/web page.

python manage.py shell

Now that the script runs in a Celery worker, the server no longer terminates, but the worker throws the following error:

[tasks]
  . transcribeApp.tasks.run_transcription

[2024-11-25 03:26:04,500: INFO/MainProcess] Connected to redis://localhost:6379/0
[2024-11-25 03:26:04,514: INFO/MainProcess] mingle: searching for neighbors
[2024-11-25 03:26:05,520: INFO/MainProcess] mingle: all alone
[2024-11-25 03:26:05,544: INFO/MainProcess] [email protected] ready.
[2024-11-25 03:26:16,253: INFO/MainProcess] Task searchApp.tasks.run_transcription[c684bdfa-ec21-4b4e-9542-0ca1f7729682] received
[2024-11-25 03:26:16,255: INFO/ForkPoolWorker-15] Starting transcription process.
[2024-11-25 03:26:16,509: WARNING/ForkPoolWorker-15] /Users/user/Desktop/project/django_app/django_venv/lib/python3.12/site-packages/whisper/__init__.py:150: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(fp, map_location=device)

[2024-11-25 03:26:16,670: ERROR/MainProcess] Process 'ForkPoolWorker-15' pid:38956 exited with 'signal 11 (SIGSEGV)'
[2024-11-25 03:26:16,683: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 11 (SIGSEGV) Job: 0.')
Traceback (most recent call last):
  File "/Users/user/Desktop/project/django_app/django_venv/lib/python3.12/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
    raise WorkerLostError(
billiard.einfo.ExceptionWithTraceback: 
"""
Traceback (most recent call last):
  File "/Users/user/Desktop/project/django_app/django_venv/lib/python3.12/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
    raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 11 (SIGSEGV) Job: 0.
"""

The implementation looks like this:

# views.py
from . import tasks
from django.shortcuts import render
from django.http import HttpResponse, JsonResponse

def trainVideos(request):
    try:
        tasks.run_transcription.delay()
        return JsonResponse({"status": "success", "message": "Transcription has started, check back later."})
        # return render(request, 'embed.html', {'data': data})
    except Exception as e:
        # Return the error response; without `return` the view yields None.
        return JsonResponse({"status": "error", "message": str(e)})

Here is what the transcribe function looks like, where the celery worker throws the worker exited prematurely error.

# transcribe.py
# Put one or two audio files (.wav, .mp3) in a folder
# and provide the folder path below.

import os

import whisper_timestamped as whisper


def transcribeTexts(model_id, filePath):
    result = []
    # List the audio folder itself, not the current working directory.
    fileNames = os.listdir(filePath)

    model = whisper.load_model(model_id)

    for files in fileNames:
        audioPath = os.path.join(filePath, files)

        audio = whisper.load_audio(audioPath)

        result.append(model.transcribe(audio, language="en"))

    return result


model_id = "tiny"
audioFilePath = "path/to/audio"  # placeholder: replace with your audio folder
transcribeTexts(model_id, audioFilePath)

Install the following libraries to reproduce the problem:

 pip install openai-whisper
 pip3 install whisper-timestamped
 pip install Django
 pip install celery redis
 pip install redis-server

The Celery implementation, celery.py in the project's main_app directory:

from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'main_app.settings')

app = Celery('main_app')

app.config_from_object('django.conf:settings', namespace='CELERY')

app.autodiscover_tasks()

# bind=True passes the task instance as `self`.
@app.task(bind=True)
def debug_tasks(self):
    print(f"Request: {self.request!r}")

tasks.py from the transcribe_app directory:

from __future__ import absolute_import, unicode_literals
from . import transcribe
from celery import shared_task

@shared_task
def run_transcription():
    transcribe.transcribe()
    return "Transcription Completed..."

The settings.py is also updated with the following:

CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_BROKER_CONNECTION_RETRY_ON_STARTUP = True

I also modified the __init__.py file in django_app:

from __future__ import absolute_import, unicode_literals

from .celery import app as celery_app

__all__ = ('celery_app',) 

For this application, some of the libraries are dependent on particular versions. All libraries and packages are listed below:

Package              Version
-------------------- -----------
amqp                 5.3.1
asgiref              3.8.1
billiard             4.2.1
celery               5.4.0
certifi              2024.8.30
charset-normalizer   3.3.2
click                8.1.7
click-didyoumean     0.3.1
click-plugins        1.1.1
click-repl           0.3.0
Cython               3.0.11
Django               5.1.2
django-widget-tweaks 1.5.0
dtw-python           1.5.3
faiss-cpu            1.9.0
ffmpeg               1.4
filelock             3.16.1
fsspec               2024.9.0
huggingface-hub      0.25.2
idna                 3.10
Jinja2               3.1.4
kombu                5.4.2
lfs                  0.2
llvmlite             0.43.0
MarkupSafe           3.0.1
more-itertools       10.5.0
mpmath               1.3.0
msgpack              1.1.0
networkx             3.3
numba                0.60.0
numpy                2.0.2
packaging            24.1
panda                0.3.1
pillow               10.4.0
pip                  24.3.1
prompt_toolkit       3.0.48
pydub                0.25.1
python-dateutil      2.9.0.post0
PyYAML               6.0.2
redis                5.2.0
regex                2024.9.11
requests             2.32.3
safetensors          0.4.5
scipy                1.14.1
semantic-version     2.10.0
setuptools           75.1.0
setuptools-rust      1.10.2
six                  1.16.0
sqlparse             0.5.1
sympy                1.13.3
tiktoken             0.8.0
tokenizers           0.20.1
torch                2.4.1
torchaudio           2.4.1
torchvision          0.19.1
tqdm                 4.66.5
transformers         4.45.2
txtai                7.4.0
typing_extensions    4.12.2
tzdata               2024.2
urllib3              2.2.3
vine                 5.1.0
wcwidth              0.2.13
whisper-timestamped  1.15.4

Overall, when I run the program independently, it works perfectly fine, but within Django it terminates however I execute it. I thought the long audio files might be the cause, so I chunked them and ran transcribe.py from the user interface again, with the same result: Worker exited prematurely: signal 11 (SIGSEGV) Job: 0. I also tried increasing the memory pool size for the worker, which didn't help. I am unsure what needs to be done to run transcribe.py within Django, since most known methods are not working for me. I may have missed something, so please help me figure this out. Thank you for your time.


Solution

  • SIGSEGV usually occurs when a program tries to access memory that it isn't allowed to access. I re-created your code and it worked completely fine on my end. Here are the probable reasons why this happened to you:

    • The pool type used by your Celery command doesn't work in your environment. The default prefork pool forks child processes (the ForkPoolWorker in your log), whereas --pool=solo runs tasks in the worker process itself without forking, and that seems to work.
    • Part of the code is executed as root and other parts aren't.
    • The file path you provided isn't correct, or the files exist with the wrong permissions.
    • You may be running this on a virtual machine with very limited RAM, so no memory is left once the heavy AI model and libraries are loaded.
    • There's an actual problem with libc on your machine or with Celery itself, though that can't be confirmed from the log.
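
    To narrow down which of these applies, one option is Python's standard faulthandler module, which prints the Python traceback when the process receives SIGSEGV. The following is a minimal sketch, assuming you place it at the top of the module that defines the task; the helper name enable_crash_tracebacks is hypothetical, and sys.__stderr__ is used on the assumption that the worker may have replaced sys.stderr with a logging proxy.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Dump the Python stack of all threads on fatal signals such as SIGSEGV."""
    if stream is None:
        # The original stderr still has a real file descriptor even if
        # sys.stderr has been redirected by the hosting process.
        stream = sys.__stderr__
    if not faulthandler.is_enabled():
        faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


# Enable as early as possible, before the model libraries are imported.
enable_crash_tracebacks()
```

    Setting the environment variable PYTHONFAULTHANDLER=1 before starting the worker has the same effect without a code change. The traceback only shows where the crash happened in Python code, so it is a starting point rather than a diagnosis.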

    I'll walk you through how I re-created your code; comparing it with yours may reveal a typo or a small mistake that caused the error you mentioned.

    django-admin startproject project101
    cd project101
    python3 manage.py startapp app101
    

    project101/urls.py:

    from django.contrib import admin
    from django.urls import path, include
    
    urlpatterns = [
        path('admin/', admin.site.urls),
        path('', include("app101.urls"))
    ]
    

    project101/settings.py:

    INSTALLED_APPS = [
        # ...
        
        'app101'
    ]
    
    # put this at the end of settings.py
    CELERY_BROKER_URL = 'redis://localhost:6379/0'
    CELERY_BROKER_CONNECTION_RETRY_ON_STARTUP = True
    

    project101/celery.py:

    from __future__ import absolute_import, unicode_literals
    import os
    from celery import Celery
    
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project101.settings')
    
    app = Celery('project101')
    
    app.config_from_object('django.conf:settings', namespace='CELERY')
    
    app.autodiscover_tasks()
    
    # bind=True passes the task instance as `self`.
    @app.task(bind=True)
    def debug_tasks(self):
        print(f"Request: {self.request!r}")
    
    

    project101/__init__.py:

    from __future__ import absolute_import, unicode_literals
    
    from .celery import app as celery_app
    
    __all__ = ('celery_app',) 
    

    app101/views.py:

    from . import tasks
    from django.shortcuts import render
    from django.http import HttpResponse, JsonResponse
    
    def trainVideos(request):
        try:
            tasks.run_transcription.delay()
            return JsonResponse({"status": "success", "message": "Transcription has started check back later."})
            # return render(request, 'embed.html', {'data': data})
        except Exception as e:
            # Return the error response; without `return` the view yields None.
            return JsonResponse({"status": "error", "message": str(e)})
    
    

    app101/urls.py:

    from django.urls import path, include
    from . import views
    
    urlpatterns = [
        path('transcribe', views.trainVideos)
    ]
    
    

    app101/tasks.py:

    from __future__ import absolute_import, unicode_literals
    from . import transcribe
    from celery import shared_task
    
    @shared_task
    def run_transcription():
        transcribe.transcribe()
        return "Transcription Completed..."
    
    

    app101/transcribe.py:

    
    import whisper_timestamped as whisper
    import os
    
    def transcribeTexts(model_id, audio_directory_path):
        result = []
        fileNames = os.listdir(audio_directory_path)
        
        model = whisper.load_model(model_id)
    
        for files in fileNames:
            print(files)
            audioPath = audio_directory_path + "/" + files
    
            audio = whisper.load_audio(audioPath)
    
            result.append(model.transcribe(audio, language="en"))
        print(result)
        return result
    
    def transcribe():
        model_id = "tiny"
        audio_directory_path = 'audio_sample'
        transcribeTexts(model_id, audio_directory_path)
    
    

    Note that audio_sample is a folder outside app101; it sits at the same level as app101 and project101. You could put it somewhere else, but make sure to specify the correct directory path. I've added the directory structure below.

    .
    ├── app101
    │   ├── admin.py
    │   ├── apps.py
    │   ├── __init__.py
    │   ├── migrations
    │   ├── models.py
    │   ├── __pycache__
    │   ├── tasks.py
    │   ├── tests.py
    │   ├── transcribe.py
    │   ├── urls.py
    │   └── views.py
    ├── audio_sample
    │   └── some_audio.mp3
    ├── db.sqlite3
    ├── manage.py
    └── project101
        ├── asgi.py
        ├── celery.py
        ├── __init__.py
        ├── __pycache__
        ├── settings.py
        ├── urls.py
        └── wsgi.py
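
    Because 'audio_sample' is a relative path, it is resolved against the directory the Celery worker was started from, which is an easy way to end up with a wrong path. A minimal sketch of one way around that, assuming transcribe.py sits one level below the project root as in the tree above; the helper name resolve_audio_dir is hypothetical.

```python
from pathlib import Path


def resolve_audio_dir(module_file, folder_name="audio_sample"):
    """Return the absolute path of folder_name located in the project root.

    module_file is expected to be the __file__ of a module inside the app
    directory (for example app101/transcribe.py), so the parent of its
    directory is taken to be the project root.
    """
    project_root = Path(module_file).resolve().parent.parent
    audio_dir = project_root / folder_name
    if not audio_dir.is_dir():
        # Fail early with a clear message instead of deep inside the model code.
        raise FileNotFoundError(f"Audio directory not found: {audio_dir}")
    return audio_dir
```

    Inside transcribe.py this could be used as audio_directory_path = str(resolve_audio_dir(__file__)), so the result no longer depends on where the worker is launched.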
    
    

    After this, run the following commands on separate terminals:

    python3 manage.py runserver
    
    celery -A project101 worker --pool=solo -l info
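
    As an alternative to passing --pool=solo on every invocation, the pool can probably also be set in the Django settings. This is a sketch that assumes the configuration is loaded with namespace='CELERY' as in celery.py above, in which case Celery's worker_pool setting is read from CELERY_WORKER_POOL.

```python
# project101/settings.py (fragment)
# With namespace='CELERY', Celery's `worker_pool` setting is read from
# CELERY_WORKER_POOL. "solo" runs tasks in the worker process without forking.
CELERY_WORKER_POOL = "solo"
```

    The Celery documentation advises selecting gevent or eventlet only through the command-line option, so this setting is best kept for solo or prefork.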
    

    This should get your project up and running. To test, send a GET request to http://localhost:8000/transcribe or simply open it in your browser.
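
    If you prefer to send the request from a script, a minimal sketch using only the standard library could look like this. The function name trigger_transcription is hypothetical, and the default URL assumes the development server from the commands above is listening on port 8000.

```python
import json
from urllib.request import urlopen


def trigger_transcription(url="http://localhost:8000/transcribe", timeout=10):
    """Send a GET request to the transcribe endpoint and return the parsed JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

    Calling trigger_transcription() while both the Django server and the Celery worker are running should return the JSON from the view, and the worker terminal should then log that the task was received.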

    Note the following:

    • This was just to walk you through how to run Celery successfully; don't forget to apply the code to your own project and run migrations as needed.
    • You can run the Celery command with different arguments, such as changing the pool from solo to gevent, but --pool=solo seems to work fine.
    • Execute everything as the same user, either root (not really recommended) or a normal user.
    • Make sure all files have the correct permissions.
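
    The permissions point can be checked quickly from Python. The following is a small sketch, assuming it is run as the same user that starts the Celery worker; the helper name unreadable_files is hypothetical.

```python
import os
from pathlib import Path


def unreadable_files(directory):
    """Return the files in directory that the current user cannot read."""
    directory = Path(directory)
    if not directory.is_dir():
        raise NotADirectoryError(f"Not a directory: {directory}")
    # os.access checks read permission for the user running this process.
    return [
        path
        for path in sorted(directory.iterdir())
        if path.is_file() and not os.access(path, os.R_OK)
    ]
```

    For example, unreadable_files("audio_sample") run from the project root should return an empty list when every audio file is readable by that user.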