django scikit-learn django-cache joblib

How to achieve faster TfidfVectorizer loading times from within a Django view?


I have a fitted TfidfVectorizer with ~120,000 features, which I save to a file using joblib.dump. I later load that model from within a Django view using joblib.load, but loading is too slow (it takes ~2 seconds). What is the best way to improve the loading speed? Should I cache the model using Django's caching framework? Should I compress the model when serializing with joblib.dump? Is there a way to load the model into memory once and keep it there, rather than reloading it each time the view is called?


Solution

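  • The model does not change between requests, so we want to load it into memory once and keep it there. This can be achieved in views.py by loading the model at module level and assigning it to a global variable. Module-level code runs once per worker process, when the module is first imported, so every later request reuses the object that is already loaded.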
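The load-once idea can be sketched as below. This is a minimal illustration rather than the answerer's exact code: the names `get_vectorizer` and `_load_from_disk` are hypothetical, and `pickle` stands in for `joblib.load` so the sketch runs with the standard library only. In a real views.py the body of `_load_from_disk` would presumably be a single `joblib.load(path)` call.

```python
import pickle
import threading

_MODEL = None                # process-wide cache, filled on first use
_LOCK = threading.Lock()     # guards the first load in threaded servers


def _load_from_disk(path):
    # Hypothetical helper. Stand-in for joblib.load(path): pickle is used
    # here only so the sketch has no third-party dependencies.
    with open(path, "rb") as fh:
        return pickle.load(fh)


def get_vectorizer(path):
    """Return the cached model, reading it from disk only on the first call."""
    global _MODEL
    if _MODEL is None:
        with _LOCK:
            # Re-check inside the lock: another thread may have loaded it
            # while this one was waiting.
            if _MODEL is None:
                _MODEL = _load_from_disk(path)
    return _MODEL


# Inside a Django view (assumed usage, with a hypothetical MODEL_PATH):
#     vectorizer = get_vectorizer(MODEL_PATH)
#     features = vectorizer.transform([request.POST["text"]])
```

Loading lazily inside a function, instead of directly at import time, keeps management commands that never touch the model from paying the load cost. Either way, the ~2 second load is paid once per worker process, not once per request.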