Streamlit install fails on GitLab CI with UnicodeEncodeError


On GitLab CI, after upgrading streamlit to 1.10.0, running pip install streamlit fails with the following error:

ERROR: Exception:
Traceback (most recent call last):
  File "/builds/project/venv/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 164, in exc_logging_wrapper
    status = run_func(*args)
  File "/builds/project/venv/lib/python3.6/site-packages/pip/_internal/cli/req_command.py", line 205, in wrapper
    return func(self, options, args)
  File "/builds/project/venv/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 413, in run
    pycompile=options.compile,
  File "/builds/project/venv/lib/python3.6/site-packages/pip/_internal/req/__init__.py", line 81, in install_given_reqs
    pycompile=pycompile,
  File "/builds/project/venv/lib/python3.6/site-packages/pip/_internal/req/req_install.py", line 810, in install
    requested=self.user_supplied,
  File "/builds/project/venv/lib/python3.6/site-packages/pip/_internal/operations/install/wheel.py", line 737, in install_wheel
    requested=requested,
  File "/builds/project/venv/lib/python3.6/site-packages/pip/_internal/operations/install/wheel.py", line 589, in _install_wheel
    file.save()
  File "/builds/project/venv/lib/python3.6/site-packages/pip/_internal/operations/install/wheel.py", line 383, in save
    if os.path.exists(self.dest_path):
  File "/builds/project/venv/lib/python3.6/genericpath.py", line 19, in exists
    os.stat(path)
UnicodeEncodeError: 'ascii' codec can't encode character '\U0001f4f9' in position 76: ordinal not in range(128)

I looked up the character that fails to encode: it is the video camera emoji 📹 (\U0001f4f9).
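
To double-check which character the traceback is complaining about, a short Python snippet can look up its Unicode name and reproduce the ASCII encoding failure. This is only an illustrative sketch; the helper name describe_char is hypothetical and not part of pip or Streamlit.

```python
import unicodedata


def describe_char(ch):
    """Return the Unicode name of ch and whether it can be encoded as ASCII."""
    try:
        ch.encode("ascii")
        ascii_ok = True
    except UnicodeEncodeError:
        # Same exception type as in the pip traceback
        ascii_ok = False
    return unicodedata.name(ch), ascii_ok


name, ascii_ok = describe_char("\U0001f4f9")
print(name, ascii_ok)  # prints: VIDEO CAMERA False
```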

How can I solve it?


Solution

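  • The bug was introduced in Streamlit 1.10.0. With the multi-page feature, the video camera emoji was added to the streamlit hello command: https://github.com/streamlit/streamlit/tree/release/1.10.0/lib/streamlit/hello/pages

    Because the traceback ends in os.stat(path), the emoji is evidently part of a file path that pip writes during installation, and that path cannot be encoded when the locale only allows ASCII.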

    In my CI, the locale settings were the following:

    $ locale || true
    LANG=
    LANGUAGE=
    LC_CTYPE="POSIX"
    LC_NUMERIC="POSIX"
    LC_TIME="POSIX"
    LC_COLLATE="POSIX"
    LC_MONETARY="POSIX"
    LC_MESSAGES="POSIX"
    LC_PAPER="POSIX"
    LC_NAME="POSIX"
    LC_ADDRESS="POSIX"
    LC_TELEPHONE="POSIX"
    LC_MEASUREMENT="POSIX"
    LC_IDENTIFICATION="POSIX"
    LC_ALL=
    

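    I switched them to a UTF-8 locale in the GitLab CI configuration:

    before_script:
        # Install locale support and generate en_US.UTF-8
        - apt-get install -y locales
        - echo "en_US.UTF-8 UTF-8" > /etc/locale.gen
        - locale-gen en_US.UTF-8
        # Make the job's shell (and therefore pip) use the UTF-8 locale
        - export LANG=en_US.UTF-8
        - export LANGUAGE=en_US:en
        - export LC_ALL=en_US.UTF-8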
    

    and now it works.

    Thanks to Bill's answer.
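
One way to confirm from inside the job that the locale change took effect is to ask Python which encodings it derived from the environment and whether it can encode a path containing the emoji. The following is a minimal sketch under assumptions: the function names encoding_report and can_encode_emoji_path and the demo path are made up for illustration, and the check only mimics the path conversion that fails in the traceback, not pip's actual install logic.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the encodings Python derived from the current locale."""
    return {
        "filesystem": sys.getfilesystemencoding(),
        "preferred": locale.getpreferredencoding(False),
        "LANG": os.environ.get("LANG", ""),
        "LC_ALL": os.environ.get("LC_ALL", ""),
    }


def can_encode_emoji_path(path="/tmp/\U0001f4f9_demo.py"):
    """Return True if the filesystem encoding can represent the path.

    os.fsencode applies the filesystem encoding, which is the conversion
    a str path goes through before a call such as os.stat. The path is
    hypothetical and is never created on disk.
    """
    try:
        os.fsencode(path)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for key, value in encoding_report().items():
        print(key, "=", value)
    print("emoji path encodable:", can_encode_emoji_path())
```

Running this before pip install streamlit, for example as an extra script line, should show whether the job still uses an ASCII filesystem encoding. Python 3.7 and later treat the C/POSIX locale differently (PEP 538 and PEP 540), so the report may already show utf-8 there even without the locale change.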