I used Python to solve SPOJ's large input test problem and ran into something very strange. I submitted the same code using PyPy and Python 2. The results are shown below:
The code ran much faster with PyPy than with CPython, as expected. But at the same time, the memory usage increased by a whopping 7 times! I searched the web but was unable to find any evidence that PyPy's memory usage is much higher than CPython's. Could someone please explain the huge difference in memory usage?
I have also considered that it could be because of my code. Hence, I have posted it below:
import io, sys, atexit, os

# buffer all output in memory and write it to the real stdout at exit
sys.stdout = io.BytesIO()
atexit.register(lambda: sys.__stdout__.write(sys.stdout.getvalue()))

# read the whole of stdin into an in-memory buffer up front
sys.stdin = io.BytesIO(sys.stdin.read())
raw_input = lambda: sys.stdin.readline().rstrip()

line = list(map(int, raw_input().split()))
num, k = line
ans = 0
for i in xrange(0, num):
    if int(raw_input()) % k == 0:
        ans += 1
print(ans)
Could someone please advise me?
First, I was not able to reproduce the results. I don't know which versions/setups SPOJ uses. For the following experiments, PyPy 5.8.0 and CPython 2.7.12 were used.
As the test case, the largest possible input file, about 110 MB in size, was used:
# create_data.py
print 10**6, 33
for i in xrange(10**6):
    print 10**9
>> python create_data.py > input.in
Now running /usr/bin/time -v XXX solution.py < input.in yields:
Interpreter    Maximal resident size
PyPy           278 MB
CPython        222 MB
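If you want to double-check the peak-memory numbers from inside the script rather than via /usr/bin/time, one option (a minimal sketch, assuming a Linux system, where ru_maxrss is reported in kilobytes) is the standard resource module:

import resource

# peak resident set size of the current process so far;
# kilobytes on Linux (note: bytes on macOS)
print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss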
PyPy needs a little bit more memory. CPython and PyPy use different garbage-collection strategies, and I think PyPy's trade-off is to be faster at the cost of using more memory. The PyPy folks have a great article about their garbage collector and its comparison to CPython's.
Second, I don't trust the numbers from the SPOJ site. sys.stdin.read() will read the whole file into memory. The Python documentation even says:
To read a file’s contents, call f.read(size), which reads some quantity of data and returns it as a string. size is an optional numeric argument. When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory.
Under the assumption that the worst case was included in their test cases, the memory usage should be at least the size of the file (110 MB), because you use sys.stdin.read(), and even twice the size, because you are copying the data.
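Here is a minimal sketch of where the second copy comes from (the name data is just for illustration; in the original code the string is an unnamed temporary, but the peak is the same because both copies coexist while the BytesIO is being constructed):

import io, sys

data = sys.stdin.read()       # first copy: the whole file as one string
sys.stdin = io.BytesIO(data)  # second copy: BytesIO keeps its own buffer
# while both are alive, peak memory is at least
# roughly twice the size of the input file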
Actually, I'm not sure the whole trouble is worth it: using raw_input() is probably fast enough, and I would just trust Python to do The Right Thing. CPython normally buffers stdout and stdin (fully buffered if they are redirected to files, or line-buffered for the console), and you have to use the command line option -u to switch this off.
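A quick way to see which buffering case applies to your process (a small sketch; isatty() reports whether a stream is connected to a terminal):

import sys

# True on a console (line-buffered); False when redirected to a file
# or pipe (fully buffered, unless python was started with -u)
print sys.stdin.isatty(), sys.stdout.isatty()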
But if you really want to be sure, you can use the file-object iterator of sys.stdin, because as the CPython man page states:
-u Force stdin, stdout and stderr to be totally unbuffered. On systems where it matters, also put stdin, stdout and stderr in binary mode. Note that there is internal buffering in xreadlines(), readlines() and file-object iterators ("for line in sys.stdin") which is not influenced by this option. To work around this, you will want to use "sys.stdin.readline()" inside a "while 1:" loop.
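For reference, the truly unbuffered work-around the man page describes would look roughly like this (a sketch; being unbuffered is exactly what makes it slow, see the readline() row in the timings below):

import sys

num, k = map(int, sys.stdin.readline().split())
ans = 0
while 1:
    line = sys.stdin.readline()
    if not line:          # readline() returns an empty string at EOF
        break
    if int(line) % k == 0:
        ans += 1
print(ans)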
Using the file-object iterator instead, your program could look like this:
import sys

num, k = map(int, raw_input().split())
ans = 0
for line in sys.stdin:
    if int(line) % k == 0:
        ans += 1
print(ans)
This has the big advantage that only around 7 MB of memory is used for this variant.
Another lesson is that you should not use sys.stdin.readline() if you are afraid that somebody might run your program in unbuffered mode.
Some further experiments (with my CPU clocked down):
Variant          CPython        CPython -u     PyPy          PyPy -u
original         28sec/221MB    25sec/221MB    3sec/278MB    3sec/278MB
raw_input()      29sec/7MB      110sec/7MB     7sec/75MB     100sec/63MB
readline()       38sec/7MB      130sec/7MB     5sec/75MB     100sec/63MB
readlines()      20sec/560MB    20sec/560MB    4sec/1.4GB    4sec/1.4GB
file-iterator    17sec/7MB      17sec/7MB      4sec/68MB     100sec/62MB
There are some takeaways:

- raw_input() and sys.stdin.readline() have identical performance.
- raw_input() is buffered, but its buffer seems to be slightly different from the buffer of the file-object iterator, which outperforms raw_input(), at least for this file.
- Memory usage of sys.stdin.readlines() seems to be pretty high, at least as long as the lines are short (see the sketch below).
- The file-object iterator keeps its internal buffering in CPython even if -u is used; for PyPy, -u switches off the buffering of the file-object iterator as well (maybe a bug?).
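For completeness, the readlines() row in the table corresponds to a variant roughly like this (a sketch, following the pattern of the programs above):

import sys

num, k = map(int, sys.stdin.readline().split())
# readlines() materializes the whole file as a list with one string
# object per line; the per-object overhead is what blows up memory
# when the individual lines are short
ans = 0
for line in sys.stdin.readlines():
    if int(line) % k == 0:
        ans += 1
print(ans)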