
Airflow fails to start due to a ValueError coming from the future package


I was previously able to run the airflow webserver and the airflow scheduler from my Anaconda environment, but now both commands fail with a ValueError and neither service starts.

The error is as follows:

(airflow_test) guy-mbp:airflow guy$ airflow webserver
Traceback (most recent call last):
  File "/Users/guy/miniconda3/envs/airflow_test/bin/airflow", line 25, in <module>
    from airflow.configuration import conf
  File "/Users/guy/miniconda3/envs/airflow_test/lib/python3.7/site-packages/airflow/__init__.py", line 31, in <module>
    from airflow.utils.log.logging_mixin import LoggingMixin
  File "/Users/guy/miniconda3/envs/airflow_test/lib/python3.7/site-packages/airflow/utils/__init__.py", line 24, in <module>
    from .decorators import apply_defaults as _apply_defaults
  File "/Users/guy/miniconda3/envs/airflow_test/lib/python3.7/site-packages/airflow/utils/decorators.py", line 34, in <module>
    from airflow import settings
  File "/Users/guy/miniconda3/envs/airflow_test/lib/python3.7/site-packages/airflow/settings.py", line 36, in <module>
    from airflow.configuration import conf, AIRFLOW_HOME, WEBSERVER_CONFIG  # NOQA F401
  File "/Users/guy/miniconda3/envs/airflow_test/lib/python3.7/site-packages/airflow/configuration.py", line 29, in <module>
    from future import standard_library
ValueError: source code string cannot contain null bytes

The Python version is 3.7.5 and the apache-airflow version is 1.10.6.

I recently downloaded some new packages into the environment. Could they have caused this issue to arise?


Solution

  • I found out what was causing my issue.

    The current version of tensorflow is missing a function, tokenizer_from_json. The function is available in keras.preprocessing.text but not in tensorflow.keras.preprocessing.text. To get it, I installed keras. That created a conflict between keras and tensorflow which conda resolved very poorly, and the environment was left in a state that produced the error in the question above.
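
The ValueError in the traceback is raised when Python tries to compile a source file that contains a null byte, which suggests that at least one installed file was left corrupted. As a rough diagnostic sketch (the helper name find_files_with_null_bytes is made up for illustration and is not part of conda or airflow), the following scans a package directory for such files:

```python
from pathlib import Path


def find_files_with_null_bytes(package_dir):
    """Return the .py files under package_dir that contain a null byte."""
    corrupted = []
    for path in sorted(Path(package_dir).rglob("*.py")):
        # A valid Python source file never contains b"\x00".
        if b"\x00" in path.read_bytes():
            corrupted.append(path)
    return corrupted
```

Pointing it at the future package directory inside the environment's site-packages (the last import in the traceback) should show whether that package is the damaged one. This only helps locate the problem; it does not repair anything.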

    To solve the issue, I first saved the environment file using the following command:

    conda env export > environment.yml
    

    Then I removed the environment using:

    conda env remove --name airflow_test
    

    Then I created a fresh environment in Anaconda and installed all the packages I needed except keras.
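
The answer does not say whether the exported environment.yml was reused or the packages were reinstalled by hand. If one wanted to reuse the file, a possible approach is to drop the keras entry from it before recreating the environment. The sketch below is only an illustration under that assumption: drop_packages is a hypothetical helper, and it assumes the usual export layout of one "- name=version" or "- name==version" entry per line.

```python
import re


def drop_packages(environment_yml_text, names=("keras",)):
    """Remove dependency lines whose package name exactly matches one of names."""
    alternatives = "|".join(re.escape(name) for name in names)
    # Match "- keras", "- keras=...", "- keras==...", but not "- keras-preprocessing=...".
    pattern = re.compile(
        r"^\s*-\s*(?:%s)\s*(?:[=<>!~].*)?$" % alternatives,
        re.IGNORECASE,
    )
    kept = [
        line
        for line in environment_yml_text.splitlines()
        if not pattern.match(line)
    ]
    return "\n".join(kept) + "\n"
```

The match is deliberately exact on the package name so that related packages such as keras-preprocessing, which tensorflow depends on, are left in place.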

    After more digging into tensorflow, I found that installing tensorflow also installs part of keras as a dependency. It is available through the keras_preprocessing package, which can be used like so:

    from keras_preprocessing.text import tokenizer_from_json
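
Since the location of tokenizer_from_json depends on what is installed, a defensive lookup can be sketched as follows. The three module paths are the ones mentioned in this answer; the helper name find_tokenizer_from_json is hypothetical, and the broad exception handling is a deliberate choice so that a broken install does not crash the lookup.

```python
import importlib

# Module paths taken from the answer, tried in order of preference.
CANDIDATE_MODULES = (
    "keras_preprocessing.text",
    "tensorflow.keras.preprocessing.text",
    "keras.preprocessing.text",
)


def find_tokenizer_from_json():
    """Return (module_name, function) for the first module that provides
    tokenizer_from_json, or (None, None) if none of them does."""
    for module_name in CANDIDATE_MODULES:
        try:
            module = importlib.import_module(module_name)
        except Exception:
            # Covers a missing package as well as a corrupted install.
            continue
        func = getattr(module, "tokenizer_from_json", None)
        if func is not None:
            return module_name, func
    return None, None
```

Calling find_tokenizer_from_json() returns the module that supplied the function, which makes it easy to confirm that keras_preprocessing is the one being used.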
    

    Hope this helps somebody.