I was following an example from a book that illustrates the Python shelve module, on macOS High Sierra.
As shown below, only two small tuples of short strings are stored in the shelf, yet, as the very last line shows, the resulting file is 16 MB.
The file only gets that large when I run the example on macOS High Sierra with a Python installed through Homebrew (either 3.6.4 or 2.7.14). If I run it on a Linux host, with the pre-installed Python (2.7.10), or with Python 3.6.4 from the official macOS installer, the resulting addresses file is just a few kilobytes, as others reported in the comments (thanks!).
~/tmp> rm addresses
~/tmp> python3
Python 3.6.4 (default, Jan 6 2018, 18:43:09)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
[...]
>>> import shelve
>>> book = shelve.open("addresses")
>>> book['flintstone'] = ('fred', '555-1234', '1233 Bedrock Place')
>>> book['rubble'] = ('barney', '555-4321', '1235 Bedrock Place')
>>> book.close()
>>>
~/tmp> ll
total 32768
-rw-r--r-- 1 moritz staff 16M Jan 24 13:05 addresses
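Since the size differs between Python builds, it may help to check which dbm backend each build actually picked for the shelf. The following is only a minimal sketch: describe_shelf is a hypothetical helper name, and the shelf is written to a temporary directory rather than ~/tmp. dbm.whichdb is the standard-library call that guesses the backend from the files on disk.

```python
import dbm
import glob
import os
import shelve
import tempfile


def describe_shelf(path):
    """Return (backend_name, total_bytes) for the shelf stored at `path`.

    Some backends append suffixes such as .db, .dat or .dir, so every
    file whose name starts with `path` is counted.
    """
    backend = dbm.whichdb(path)
    files = [f for f in glob.glob(glob.escape(path) + "*") if os.path.isfile(f)]
    total_bytes = sum(os.path.getsize(f) for f in files)
    return backend, total_bytes


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "addresses")
    with shelve.open(path) as book:
        book["flintstone"] = ("fred", "555-1234", "1233 Bedrock Place")
        book["rubble"] = ("barney", "555-4321", "1235 Bedrock Place")
    backend, size = describe_shelf(path)
    print("backend:", backend)
    print("size on disk (bytes):", size)
```

Running this with each interpreter (Homebrew, system, official installer) should show whether the large file coincides with one particular backend.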
I can confirm that this behavior was introduced by gdbm 1.14; gdbm is the library used by shelve to access the database file.
With change 2e8a5e0, gdbm tries to extend the file size to match next_block_size. next_block_size is calculated as 4 * block_size, where block_size is the optimal I/O block size of the underlying filesystem, taken from the st_blksize field returned by stat(2). On my macOS 10.13.3, for a file on an APFS SSD volume, st_blksize is 4194304 bytes, so next_block_size is 16777216 bytes, and the initial database file is therefore 16 MB.
PS: I also checked an HFS+ filesystem on an HDD volume I had at hand; there, st_blksize is 4096 bytes.
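The arithmetic above can be reproduced on any Unix-like system. This is a rough sketch, not gdbm's actual code: next_block_size_for is a hypothetical helper, the factor of 4 is taken from the explanation above, and st_blksize is only available where the platform's stat provides it (not on Windows).

```python
import os
import tempfile


def next_block_size_for(path, multiplier=4):
    """Return (st_blksize, multiplier * st_blksize) for the file at `path`.

    The multiplier of 4 mirrors the description above; it is an
    assumption here, not something read from the gdbm sources.
    """
    block_size = os.stat(path).st_blksize
    return block_size, multiplier * block_size


# Stat a scratch file on the filesystem of interest (here: the temp dir).
with tempfile.NamedTemporaryFile() as scratch:
    block, next_block = next_block_size_for(scratch.name)

print("st_blksize:", block)
print("predicted initial size:", next_block)

# The figures reported above for APFS: 4 * 4194304 bytes = 16 MiB.
print(4 * 4194304 == 16 * 1024 * 1024)  # True
```

To check a specific volume, create the scratch file there instead, for example by passing dir= to NamedTemporaryFile.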