
pip mangles password for proxy


I am trying to install Python packages on an embedded device running some form of Yocto Linux. I managed to connect this device to the proxy at work. After setting http[s]_proxy accordingly, wget http://www.google.de works and downloads Google's landing page. However, pip still fails.

Executing pip3 install seaborn yields

ERROR: Exception:
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 188, in main
    status = self.run(options, args)
  File "/usr/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 345, in run
    resolver.resolve(requirement_set)
  File "/usr/lib/python3.7/site-packages/pip/_internal/legacy_resolve.py", line 196, in resolve
    self._resolve_one(requirement_set, req)
  File "/usr/lib/python3.7/site-packages/pip/_internal/legacy_resolve.py", line 359, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "/usr/lib/python3.7/site-packages/pip/_internal/legacy_resolve.py", line 307, in _get_abstract_dist_for
    self.require_hashes
  File "/usr/lib/python3.7/site-packages/pip/_internal/operations/prepare.py", line 134, in prepare_linked_requirement
    req.populate_link(finder, upgrade_allowed, require_hashes)
  File "/usr/lib/python3.7/site-packages/pip/_internal/req/req_install.py", line 211, in populate_link
    self.link = finder.find_requirement(self, upgrade)
  File "/usr/lib/python3.7/site-packages/pip/_internal/index.py", line 1201, in find_requirement
    req.name, specifier=req.specifier, hashes=hashes,
  File "/usr/lib/python3.7/site-packages/pip/_internal/index.py", line 1183, in find_candidates
    candidates = self.find_all_candidates(project_name)
  File "/usr/lib/python3.7/site-packages/pip/_internal/index.py", line 1128, in find_all_candidates
    for page in self._get_pages(url_locations, project_name):
  File "/usr/lib/python3.7/site-packages/pip/_internal/index.py", line 1282, in _get_pages
    page = _get_html_page(location, session=self.session)
  File "/usr/lib/python3.7/site-packages/pip/_internal/index.py", line 234, in _get_html_page
    resp = _get_html_response(url, session=session)
  File "/usr/lib/python3.7/site-packages/pip/_internal/index.py", line 182, in _get_html_response
    "Cache-Control": "max-age=0",
  File "/usr/lib/python3.7/site-packages/pip/_vendor/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3.7/site-packages/pip/_internal/download.py", line 624, in request
    return super(PipSession, self).request(method, url, *args, **kwargs)
  File "/usr/lib/python3.7/site-packages/pip/_vendor/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.7/site-packages/pip/_vendor/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.7/site-packages/pip/_vendor/cachecontrol/adapter.py", line 53, in send
    resp = super(CacheControlAdapter, self).send(request, **kw)
  File "/usr/lib/python3.7/site-packages/pip/_vendor/requests/adapters.py", line 412, in send
    conn = self.get_connection(request.url, proxies)
  File "/usr/lib/python3.7/site-packages/pip/_vendor/requests/adapters.py", line 309, in get_connection
    proxy_manager = self.proxy_manager_for(proxy)
  File "/usr/lib/python3.7/site-packages/pip/_vendor/requests/adapters.py", line 192, in proxy_manager_for
    proxy_headers = self.proxy_headers(proxy)
  File "/usr/lib/python3.7/site-packages/pip/_vendor/requests/adapters.py", line 390, in proxy_headers
    password)
  File "/usr/lib/python3.7/site-packages/pip/_vendor/requests/auth.py", line 63, in _basic_auth_str
    password = password.encode('latin1')
UnicodeEncodeError: 'latin-1' codec can't encode character '\ufffd' in position 5: ordinal not in range(256)

From this I gather that the replacement character \ufffd is somehow being inserted into my password. My password, as set in http_proxy, does not contain any non-ASCII characters. The most "exotic" symbol is a percent sign "%".

More Details:

  • pip3 version 19.2.3
  • python version 3.7
  • http[s]_proxy=http[s]://[user]:[pass]@[proxy_url]:[proxy_port]

Using pip's --proxy option does not help either.

How can I bypass this?


Solution

  • Special characters in the proxy credentials have to be percent-encoded (URL-encoded).

    After trying for two hours I finally found a solution. The hint came from this post.

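    In the pip.ini shown there, the domain is separated from the user name with the URL encoding %5C instead of a literal backslash (\). They also explain that special characters have to be percent-encoded for pip.

    This was the core error: pip URL-decodes the password as part of the proxy URL, so a literal % is treated as the start of a percent escape. Judging by the traceback, the decoded result contained the replacement character \ufffd, which then cannot be encoded as latin-1. Writing the percent sign as %25 avoids this.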

    Also, just passing --proxy was not enough. For whatever reason, pip tried to open subsequent HTTPS connections without that proxy argument, which made it impossible to reach pypi.org. Therefore, you have to set the environment variables http_proxy and https_proxy accordingly, or possibly set the proxy in pip.ini as explained in the post mentioned above. Remember to percent-encode any special characters in the credentials.

    Since our proxy setup is a bit scuffed anyway, I installed with the following command to avoid any more mishaps with SSL / TLS handshaking:

        pip3 install --trusted-host files.pythonhosted.org --trusted-host pypi.org --trusted-host pypi.python.org -vv [package]

    PS: Thanks @tink for your help. Unfortunately this yocto build had no locale and changing LANG to anything like de_DE, de_DE.UTF8 or en_US.UTF8 just completely stopped pip from parsing anything.
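
To illustrate the percent-encoding point, here is a minimal sketch that uses only Python's standard urllib.parse module. It assumes the credentials are URL-decoded with the defaults of urllib.parse.unquote (UTF-8, invalid bytes replaced by \ufffd), which matches the traceback in the question but is not taken from pip's source. The password "abcde%F3xyz", the user "DOMAIN\user", the host "proxy.example.com" and the helper names are all made up for the example.

```python
from urllib.parse import quote, unquote


def decode_credential(raw):
    """URL-decode a credential with unquote's defaults (UTF-8, errors='replace').

    Assumption: this mirrors what happens to the password from the proxy URL.
    """
    return unquote(raw)


def build_proxy_url(user, password, host, port, scheme="http"):
    """Build a proxy URL with percent-encoded user name and password."""
    # safe="" makes quote() encode every reserved character,
    # including "%", "@", ":", "/" and "\".
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


if __name__ == "__main__":
    raw_password = "abcde%F3xyz"  # hypothetical password with a literal "%"

    # Unencoded: "%F3" is read as a percent escape, the byte 0xF3 is not
    # valid UTF-8 on its own, so it becomes U+FFFD and latin-1 encoding fails.
    mangled = decode_credential(raw_password)
    try:
        mangled.encode("latin1")
    except UnicodeEncodeError as exc:
        print("unencoded password fails:", exc)

    # Encoded: "%" is written as "%25" and survives the decoding step.
    url = build_proxy_url("DOMAIN\\user", raw_password, "proxy.example.com", 8080)
    print(url)
```

The printed URL is what would go into http_proxy and https_proxy (or pip.ini). Decoding the encoded password gives back the original string, so the proxy receives the password exactly as typed.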