Is there any way to have a function in Python that can walk a folder with a list of files & separate the list into "partitions" (which will become folders) based on total size of the files in each partition/folder in megabytes? I'm not sure how to start with this or what to do first.
Assuming you want a starting point, not a solution in a can:
Use os.walk to scan a whole directory tree. If you only need to scan one folder, not a whole tree, you can optimize a bit without sacrificing simplicity (particularly on Windows) on Python 3.5 and later with the os.scandir function, which gives you stat info for free on Windows (and makes it accessible as a lazily cached value on *NIX systems). On earlier versions of Python, the third-party scandir module on PyPI provides the same interface.
If you're not using os.scandir, use os.stat to get file sizes.
Use collections.defaultdict(set) to map from file sizes in MB to a set of files that round to that size (or just process the files as you go instead of storing them in a container at all). Alternatively, sort with sorted, key-ed on the size, and use itertools.groupby (with whatever MB granularity you like) to group the resulting files.
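To make those pieces concrete, here is a rough sketch of how they could fit together for a single folder. The function names (file_sizes, group_with_defaultdict, group_with_groupby) and the MB constant are made up for illustration, a megabyte is assumed to mean 2**20 bytes, and the with-statement around os.scandir assumes Python 3.6 or later. Treat it as a starting point, not a finished solution.

```python
import os
from collections import defaultdict
from itertools import groupby

MB = 1024 * 1024  # assumption: one megabyte is 2**20 bytes


def file_sizes(folder):
    """Yield (path, size_in_bytes) for each regular file directly inside folder."""
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                # entry.stat() reuses cached info where the platform provides it
                yield entry.path, entry.stat(follow_symlinks=False).st_size


def group_with_defaultdict(folder):
    """Map each size, rounded to whole MB, to the set of files with that size."""
    groups = defaultdict(set)
    for path, size in file_sizes(folder):
        groups[round(size / MB)].add(path)
    return groups


def group_with_groupby(folder):
    """Same grouping via sorted() and itertools.groupby, as (mb, [paths]) pairs."""
    # groupby only merges adjacent items, so sort by size first
    by_size = sorted(file_sizes(folder), key=lambda item: item[1])
    return [
        (mb, [path for path, _ in items])
        for mb, items in groupby(by_size, key=lambda item: round(item[1] / MB))
    ]
```

Calling group_with_defaultdict("some_folder") gives you a dict keyed by size in MB; swap round(size / MB) for something like size // (10 * MB) if you want coarser buckets. For a whole tree, you would replace file_sizes with a loop over os.walk that calls os.stat on each joined path.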