I am trying to calculate all the distances between approximately a hundred thousand points. I have the following code written in Fortran and compiled using f2py:
C        1         2         3         4         5         6         7
C123456789012345678901234567890123456789012345678901234567890123456789012
      subroutine distances(coor,dist,n)
      double precision coor(n,3),dist(n,n)
      integer n
      double precision x1,y1,z1,x2,y2,z2,diff2

cf2py intent(in) :: coor,dist
cf2py intent(in,out):: dist
cf2py intent(hide)::n
cf2py intent(hide)::x1,y1,z1,x2,y2,z2,diff2

      do 200,i=1,n-1
          x1=coor(i,1)
          y1=coor(i,2)
          z1=coor(i,3)
          do 100,j=i+1,n
              x2=coor(j,1)
              y2=coor(j,2)
              z2=coor(j,3)
              diff2=(x1-x2)*(x1-x2)+(y1-y2)*(y1-y2)+(z1-z2)*(z1-z2)
              dist(i,j)=sqrt(diff2)
  100     continue
  200 continue
      end
I am compiling the Fortran code with the following Python script, setup_collision.py:
# System imports
from distutils.core import *
from distutils import sysconfig

# Third-party modules
import numpy
from numpy.distutils.core import Extension, setup

# Obtain the numpy include directory. This logic works across numpy versions.
try:
    numpy_include = numpy.get_include()
except AttributeError:
    numpy_include = numpy.get_numpy_include()

# simple extension module
collision = Extension(name="collision",
                      sources=['./collision.f'],
                      include_dirs=[numpy_include],
                      )

# NumyTypemapTests setup
setup(name="COLLISION",
      description="Module calculates collision energies",
      author="Stvn66",
      version="0.1",
      ext_modules=[collision]
      )
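I build the module from this script in the usual way, e.g.:

python setup_collision.py build_ext --inplace

which produces the collision extension imported below.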
Then I run it as follows:
import numpy as np
import collision
coor = np.loadtxt('coordinates.txt')
n_atoms = len(coor)
dist = np.zeros((n_atoms, n_atoms), dtype=np.float16) # float16 reduces memory
n_dist = n_atoms*(n_atoms-1)/2
n_GB = n_dist * 2 / float(2**30) # 1 GB = 1024**3 B
n_Gb = n_dist * 2 / 1E9 # 1 GB = 1E9 B
print 'calculating %d distances between %d atoms' % (n_dist, n_atoms)
print 'should use between %f and %f GB of memory' % (n_GB, n_Gb)
dist = collision.distances(coor, dist)
Using this code with 30,000 atoms, the distance matrix should take around 1 GB of memory, yet the run uses 10 GB instead. At that rate, the calculation with 100,000 atoms would require 100 GB instead of 10 GB, and I only have 20 GB in my computer.
Am I missing something about how the data is passed between Python and Fortran? Such a huge difference points to a major flaw in the implementation.
You are feeding double precision arrays to the Fortran subroutine. Each double precision element requires 8 bytes of memory. For N = 30,000 that makes
coor(n,3) => 30,000*3*8 ~ 0.7 MB
dist(n,n) => 30,000^2*8 ~ 6.7 GB
Since the half precision (float16) dist array is additionally kept on the Python side, that accounts for another 1-2 GB. So the overall requirement is 9-10 GB.
The same holds true for N = 100,000, which will require ~75 GB for the Fortran part alone.
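You can confirm these figures without allocating anything by multiplying the element counts with the item sizes NumPy reports. A back-of-the-envelope sketch for the N = 30,000 case from the question:

import numpy as np

n = 30000
f64 = np.dtype(np.float64).itemsize   # 8 bytes, matches Fortran double precision
f16 = np.dtype(np.float16).itemsize   # 2 bytes, the dtype allocated on the Python side

print 'coor (float64): %6.1f MB' % (n * 3 * f64 / 2.0**20)   # ~0.7 MB
print 'dist (float64): %6.1f GB' % (n * n * f64 / 2.0**30)   # ~6.7 GB needed by the Fortran routine
print 'dist (float16): %6.1f GB' % (n * n * f16 / 2.0**30)   # ~1.7 GB held in addition by Python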
Instead of double precision floats, you should use single precision reals, if that is sufficient for your calculations. This halves the memory requirement. [I have no experience with that, but I assume that if both parts use the same precision, Python can operate on the data directly...]
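To make that concrete, here is a rough sketch of the calling side under that assumption: coor and dist are declared real instead of double precision in the subroutine, and the module is rebuilt. The idea is that the dtype (float32) and memory layout (Fortran order) on the Python side then match what the wrapped routine expects, so no double precision conversion copy should be needed:

import numpy as np
import collision   # rebuilt after changing the declarations to real

# single precision, column-major, to match the (assumed) real declarations
coor = np.asfortranarray(np.loadtxt('coordinates.txt'), dtype=np.float32)
n_atoms = len(coor)

dist = np.zeros((n_atoms, n_atoms), dtype=np.float32, order='F')
dist = collision.distances(coor, dist)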
As @VladimirF noted in his comment, "usual compilers do not support 2 byte reals". I checked with gfortran and ifort, and neither does. So you need to use at least single precision.