When I run pip install pillow==10.2.0 to install Pillow on my machine, the file downloaded from the server is named pillow-10.2.0-cp312-cp312-win_amd64.whl.
After the download, it is cached on my local disk:
C:\Users\chj\AppData\Local\pip\cache\http-v2\5\d\a\8\5\5da855ed79847734593562113083d79c2634b1696f0d635c02984eb4.body
How is that ID string (used as the cache filename), 5da855ed79847734593562113083d79c2634b1696f0d635c02984eb4, calculated?
It is not the SHA224 or SHA256 of the .body file.
sha224(5da855ed79847734593562113083d79c2634b1696f0d635c02984eb4.body)=98dede132d9782d07fd42cf70d1734984f8bfd5c60c6018903053d68
sha256(5da855ed79847734593562113083d79c2634b1696f0d635c02984eb4.body)=154e939c5f0053a383de4fd3d3da48d9427a7e985f58af8e94d0b3c9fcfcf4f9
Then what is it?
The ID string is the SHA-224 hash of the download URL, in hex:
$ echo -n 'https://files.pythonhosted.org/packages/51/07/7e9266a59bb267b56c1f432f6416653b9a78dda771c57740d064a8aa2a44/pillow-10.2.0-cp312-cp312-win_amd64.whl' | openssl sha224 -hex
SHA2-224(stdin)= 5da855ed79847734593562113083d79c2634b1696f0d635c02984eb4
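For anyone without openssl at hand (for example on Windows), the same check can be sketched in Python with the standard hashlib module. The helper name cache_id is made up for this illustration and is not part of pip; the URL is the one from the command above, and the printed digest should match the openssl output if the URL is byte-for-byte identical.

```python
import hashlib

# URL copied from the openssl example above.
url = (
    "https://files.pythonhosted.org/packages/51/07/"
    "7e9266a59bb267b56c1f432f6416653b9a78dda771c57740d064a8aa2a44/"
    "pillow-10.2.0-cp312-cp312-win_amd64.whl"
)


def cache_id(url: str) -> str:
    """Return the hex SHA-224 digest of the UTF-8 encoded URL.

    Hypothetical helper name, used only for this illustration.
    """
    return hashlib.sha224(url.encode("utf-8")).hexdigest()


# Prints a 56-character hex string to compare with the cache filename.
print(cache_id(url))
```

Note that the hash is taken over the URL text, not over the downloaded file, which is why hashing the .body file gives a different value.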
You didn't specify which version of pip you are using, so I checked out the latest version from https://github.com/pypa/pip and searched for '.body'. That turned up two references, the most promising of which is in file_cache.py:
def get_body(self, key: str) -> IO[bytes] | None:
    name = self._fn(key) + ".body"
    try:
        return open(name, "rb")
    except FileNotFoundError:
        return None
From there, follow _fn() to encode(), and finally look at the call to get_body(cache_url):
@staticmethod
def encode(x: str) -> str:
    return hashlib.sha224(x.encode()).hexdigest()

def _fn(self, name: str) -> str:
    # NOTE: This method should not change as some may depend on it.
    #       See: https://github.com/ionrock/cachecontrol/issues/63
    hashed = self.encode(name)
    parts = list(hashed[:5]) + [hashed]
    return os.path.join(self.directory, *parts)

body_file = self.cache.get_body(cache_url)
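Putting the pieces together, here is a minimal sketch of how the on-disk path follows from the hash. It imitates the _fn() and get_body() logic quoted above rather than calling pip itself; the function names body_path_from_hash and body_path_from_url are made up for this example, and the cache directory is passed in as a plain string.

```python
import hashlib
import os


def body_path_from_hash(cache_dir: str, hashed: str) -> str:
    """Build the .body path for an already computed hex digest.

    Mirrors the layout in _fn(): one directory level per character for
    the first five hex digits, then the full digest as the file name.
    """
    parts = list(hashed[:5]) + [hashed + ".body"]
    return os.path.join(cache_dir, *parts)


def body_path_from_url(cache_dir: str, url: str) -> str:
    """Hash the URL with SHA-224 (as encode() does) and build the path."""
    hashed = hashlib.sha224(url.encode()).hexdigest()
    return body_path_from_hash(cache_dir, hashed)


# Digest taken from the question; "http-v2" is the cache subdirectory
# seen in the question's path.
digest = "5da855ed79847734593562113083d79c2634b1696f0d635c02984eb4"
print(body_path_from_hash("http-v2", digest))
```

With the digest from the question, the first five characters 5, d, a, 8, 5 become the nested directories, which matches the http-v2\5\d\a\8\5\ prefix of the cached file's path.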