While doing some performance testing in Python, I compared timings for different methods of calculating the Euclidean distances between the coordinates in an array. I found my Fortran code, compiled with F2PY, to be roughly 4x slower than the C implementation used by SciPy. Comparing that C code to my Fortran code, I see no fundamental difference that would account for the factor of 4. Here is my code (with some comments explaining its use):
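      subroutine distance(coor,dist,n)
      double precision coor(n,3),dist(n,n)
      integer n,i,j
      double precision xij,yij,zij
cf2py intent(in):: coor,n
cf2py intent(in,out):: dist
cf2py intent(hide):: xij,yij,zij
c     fill the upper triangle of dist with the pairwise distances
      do 200,i=1,n-1
         do 300,j=i+1,n
            xij=coor(i,1)-coor(j,1)
            yij=coor(i,2)-coor(j,2)
            zij=coor(i,3)-coor(j,3)
            dist(i,j)=dsqrt(xij*xij+yij*yij+zij*zij)
  300    continue
  200 continue
      end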
c        1         2         3         4         5         6         7
c to setup and incorporate into python (requires numpy):
c # python build
c # cp build/lib*/ ./
c to call this from python add the following lines:
c >>> import sys ; sys.path.append('./')
c >>> from distance import distance
c >>> dist = distance(coor, dist)
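For checking the output of the compiled routine on a small input, a pure-Python reference can help. The sketch below assumes the Fortran loop stores the Euclidean distance between points i and j in `dist(i,j)` for i < j and leaves the rest of the matrix untouched. The name `pairwise_distances` is hypothetical, and the code is meant for validation only, not for timing.

```python
import math


def pairwise_distances(coor):
    """Return an n x n list of lists whose upper triangle holds the
    Euclidean distance between 3-D points i and j (i < j)."""
    n = len(coor)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        xi, yi, zi = coor[i]
        for j in range(i + 1, n):
            xj, yj, zj = coor[j]
            dist[i][j] = math.sqrt(
                (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            )
    return dist


# Three points chosen so that every distance is a whole number.
coor = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
dist = pairwise_distances(coor)
print(dist[0][1], dist[0][2], dist[1][2])  # prints: 5.0 12.0 13.0
```

Comparing these values against the upper triangle returned by the F2PY module for the same coordinates is a quick way to confirm that both implementations compute the same thing before comparing their speed.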
Looking at the compile command run by F2PY, I noticed there is no avx compile flag. I tried adding it in the Python setup file using `extra_compile_args=['-mavx']`, but this did not change the compile command run by F2PY:
compiling Fortran sources
Fortran f77 compiler: /usr/bin/gfortran -Wall -g -ffixed-form -fno-second-underscore -fPIC -O3 -funroll-loops
Fortran f90 compiler: /usr/bin/gfortran -Wall -g -fno-second-underscore -fPIC -O3 -funroll-loops
Fortran fix compiler: /usr/bin/gfortran -Wall -g -ffixed-form -fno-second-underscore -Wall -g -fno-second-underscore -fPIC -O3 -funroll-loops
compile options: '-I/home/user/anaconda/lib/python2.7/site-packages/numpy/core/include -Ibuild/src.linux-x86_64-2.7 -I/home/user/anaconda/lib/python2.7/site-packages/numpy/core/include -I/home/user/anaconda/include/python2.7 -c'
gfortran:f77: ./distance.f
creating build/lib.linux-x86_64-2.7
/usr/bin/gfortran -Wall -g -Wall -g -shared build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/distancemodule.o build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/fortranobject.o build/temp.linux-x86_64-2.7/distance.o -L/home/user/anaconda/lib -lpython2.7 -lgfortran -o build/lib.linux-x86_64-2.7/
To answer how to add the avx flag to the compiler options: in your case the f77 compiler is being picked, as this line of the build log shows:
gfortran:f77: ./distance.f
That is the key line.
You could try specifying --f77flags=-mavx
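As a rough sketch of how that option might be passed, the snippet below assembles an f2py command line from Python. It assumes the build is driven through `python -m numpy.f2py` with the distutils-based backend that produced the log above. The helper name `f2py_command` is hypothetical; `-c`, `-m` and `--f77flags=` are f2py options.

```python
import shlex
import sys


def f2py_command(source, module, f77flags):
    """Assemble an f2py build command that forwards flags to the F77 compiler."""
    return [
        sys.executable, "-m", "numpy.f2py",
        "-c",                      # compile and build the extension module
        "--f77flags=" + f77flags,  # flags for the fixed-form (F77) compiler
        "-m", module,              # name of the resulting Python module
        source,
    ]


cmd = f2py_command("distance.f", "distance", "-mavx")
print(shlex.join(cmd))
```

Running the command, for example with `subprocess.run(cmd, check=True)`, and then reading the "Fortran f77 compiler:" line of the build output shows whether `-mavx` actually reached gfortran. It also shows whether the default flags such as `-O3 -funroll-loops` are still present, which is worth confirming rather than assuming.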