python python-3.x optimization memory memory-management

How can I sort a dictionary by value without creating even a temporary copy

I have a dictionary, which I need to sort by key. I physically do not have enough memory to do it.

x = dict(sorted(x.items(), key=lambda item: item[1])

Is there a way to sort it without creating even a temporary spike in memory usage. I though I could maybe use pop(), to remove items from the original, to keep the same amount of data in memory? But I don't know if there is a simpler way to do it.

My dictionary is about 10^8 objects, which is taking up about 100 Gb. I have about 20-25 Gb of memory free.

Solution

Not no extra space, but much less, so maybe still useful for you or others with the same problem. Memory measurements (peak bytes) with 10^5 items:

12,059,636 baseline
24,408,256 original
19,065,304 better 1
18,709,352 better 2
14,265,296 sort keys, sort vals
12,949,792 sort keys, sort vals 2

baseline is for creating the original dict. Your original solution peaks at additional 12.3 MB. My best alternative peaks at additional 0.9 MB.

Code (Try it online!):

import tracemalloc as tm
from random import random
import gc

n = 10**5

def start(label):
  global x, label_
  label_ = label
  gc.collect()
  tm.start()
  x = {random(): random() for _ in range(n)}

def stop():
  global x
  print(f'{tm.get_traced_memory()[1]:10,}', label_)
  tm.stop()
  if label_ != 'baseline':
    assert len(x) == n
    assert list(x.values()) == sorted(x.values()), list(x.values())
  del x
  gc.collect()

for _ in range(2):

  start('baseline')
  stop()

  start('original')
  x = dict(sorted(x.items(), key=lambda item: item[1]))
  stop()

  start('better 1')
  x = list(x.items())
  x.sort(key=lambda item: item[1])
  x = dict(x)
  stop()

  start('better 2')
  ks = list(x)
  ks.sort(key=x.get)
  x = dict(zip(ks, map(x.pop, ks)))
  stop()

  start('sort keys, sort vals')
  keys = list(x)
  keys.sort(key=x.get)
  vals = list(x.values())
  del x
  vals.sort()
  x = dict(zip(keys, vals))
  stop()

  start('sort keys, sort vals 2')
  keys = list(x)
  keys.sort(key=x.get, reverse=True)
  vals = list(x.values())
  del x
  vals.sort(reverse=True)
  x = {}
  while keys:
    x[keys.pop()] = vals.pop()
  stop()

  print()