Speeding up repeated use of bz2.BZ2File for pickling

I'm pickling many objects repeatedly, though not back to back. The pickled output files turned out to be too large (about 256MB each).

So I tried bz2.BZ2File instead of open, and each file shrank to 1.3MB. (Yeah, wow.) The problem is that it takes too long (about 95 seconds to pickle one object), and I want to speed it up.
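
For reference, the swap described above amounts to roughly the following. This is only a minimal sketch: the helper names, the sample dictionary, the file paths, and the choice of pickle.HIGHEST_PROTOCOL are assumptions made for illustration, not details from my actual code.

```python
import bz2
import os
import pickle
import tempfile


def dump_plain(obj, path):
    """Pickle obj to an uncompressed file (the original approach)."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def dump_bz2(obj, path):
    """Pickle obj through bz2.BZ2File: smaller output, slower to write."""
    with bz2.BZ2File(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_bz2(path):
    """Read back an object written by dump_bz2."""
    with bz2.BZ2File(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for one of the zero-heavy dictionaries.
sample = {"weights": [0.0] * 100_000, "name": "example"}

tmpdir = tempfile.mkdtemp()
plain_path = os.path.join(tmpdir, "sample.pkl")
bz2_path = os.path.join(tmpdir, "sample.pkl.bz2")

dump_plain(sample, plain_path)
dump_bz2(sample, bz2_path)

print(os.path.getsize(plain_path), os.path.getsize(bz2_path))
```

Comparing the two printed sizes shows the same kind of shrinkage, just at a much smaller scale.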

Each object is a dictionary, and most of them share a similar structure: almost the same set of keys, with the value under each key usually following its own fixed structure, and so on down the hierarchy. Many of the dictionary values are numpy arrays, and I expect them to contain a lot of zeros.

Can you give me some advice on how to make it faster?

Thank you!

Solution

I ended up using lz4, a compression algorithm that favors speed over compression ratio.

There is a Python wrapper, which is easy to install:

pip install lz4
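
Once it is installed, usage could look roughly like the sketch below. It assumes the lz4.frame module from that package and uses its compress and decompress functions. The helper names, the sample dictionary, and the file path are hypothetical, and the zlib fallback is there only so the snippet still runs where lz4 is missing.

```python
import os
import pickle
import tempfile

try:
    import lz4.frame  # third-party package: pip install lz4

    compress = lz4.frame.compress
    decompress = lz4.frame.decompress
except ImportError:
    # Assumption for illustration only: fall back to zlib at its fastest level.
    import zlib

    def compress(data):
        return zlib.compress(data, 1)

    decompress = zlib.decompress


def dump_compressed(obj, path):
    """Pickle obj in memory, compress the bytes, and write them to path."""
    raw = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, "wb") as f:
        f.write(compress(raw))


def load_compressed(path):
    """Read the file at path, decompress it, and unpickle the object."""
    with open(path, "rb") as f:
        return pickle.loads(decompress(f.read()))


# Hypothetical stand-in for one of the zero-heavy dictionaries.
sample = {"weights": [0.0] * 100_000, "name": "example"}

path = os.path.join(tempfile.mkdtemp(), "sample.pkl.lz4")
dump_compressed(sample, path)
restored = load_compressed(path)

raw_size = len(pickle.dumps(sample, protocol=pickle.HIGHEST_PROTOCOL))
print(raw_size, os.path.getsize(path))
```

Pickling to bytes first and compressing in one call keeps the sketch simple, but it holds the whole uncompressed pickle in memory, so it is worth timing on real data before settling on it.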
