I'm following the vegan tutorial on procrustes analysis:
https://www.rdocumentation.org/packages/vegan/versions/2.4-2/topics/procrustes
# Load data
library(vegan)
data(varespec)
# Ordinations
vare.dist <- vegdist(wisconsin(varespec))
mds.null <- monoMDS(vare.dist, y = cmdscale(vare.dist))
mds.alt <- monoMDS(vare.dist)
# Run procrustes
vare.proc <- procrustes(mds.alt, mds.null)
# Summary of procrustes
summary(vare.proc)
Call:
procrustes(X = mds.alt, Y = mds.null)
Number of objects: 24 Number of dimensions: 2
Procrustes sum of squares:
13.99951
Procrustes root mean squared error:
0.7637492
Quantiles of Procrustes errors:
Min 1Q Median 3Q Max
0.08408327 0.23038165 0.49316643 0.65854198 1.99105837
Rotation matrix:
[,1] [,2]
[1,] 0.9761893 0.2169202
[2,] 0.2169202 -0.9761893
of averages:
[,1] [,2]
[1,] -2.677059e-17 -5.251452e-18
Scaling of target:
[1] 0.6455131
plot(vare.proc)
Now with Python I can choose either:
# Returns
mtx1: array_like
A standardized version of data1.
mtx2: array_like
The orientation of data2 that best fits data1. Centered, but not necessarily .
disparity: float
as defined above.
R(N, N): ndarray
The matrix solution of the orthogonal Procrustes problem. Minimizes the Frobenius norm of (A @ R) - B, subject to R.T @ R = I.
scale: float
Sum of the singular values of A.T @ B.
My questions:
vegan::procrustes
, scipy.spatial.procrustes
or scipy.linalg.orthogonal_procrustes
?Using @Kat's answer for the plotting and my additional answer for the SciPy/NumPy implementation, one should be able to reproduce the VEGAN analysis in Python exactly using only standard packages from PyData.
Just in case it helps anyone, I've implemented both Procrustes
and Protest
(along with the plotting method described here) from vegan
in my soothsayer
python package:
https://github.com/jolespin/soothsayer/blob/60accb669062c559ba785be201deddeede26a049/soothsayer/ordination/ordination.py#L715
The manner in which the vegan
package runs through monoMDS
doesn't seem to align with any of the methods that Python uses. The code in vegan
is all based on Fortran. You could link the Fortan to Python, just like you can in R.
While the scaling is different, the rotation matrix within vare.proc
is exactly the same as the rotation matrix (array?) in the Python coded object above vare_proc
. I found that to be interesting.
If I bring in the pieces of the content in R for your object vare.proc
, I can make the exact graph.
First, I used R Markdown to share objects between Python and R.
# compare to R
# vpR = vareProc$rotation
vpR_py = r.vpR
# vpYrot = vareProc$Yrot
vpYrot_py = r.vpYrot
# vpX = vareProc$X
vpX_py = r.vpX
Then I used these objects to make the replica
# build the plot exactly like vegan
# using https://github.com/vegandevs/vegan/blob/master/R/plot.procrustes.R
tails = vpYrot_py
heads = vpX_py
# find the ranges
xMax = max(abs(np.hstack((tails[:,0], heads[:,0]))))
yMax = max(abs(np.hstack((tails[:,1], heads[:,1]))))
plt.figure(dpi = 100)
fig, ax = plt.subplots()
ax.set_aspect('equal')
ax.set_xlim(-xMax, xMax)
ax.set_ylim(-yMax, yMax)
# add points
ax.scatter(x = tails[:,0],
y = tails[:,1],
facecolors = 'none',
edgecolors = 'k')
# add origin axes
ax.axhline(y = 0, color='k', ls = '--', linewidth = 1) # using dashed for origin
ax.axvline(x = 0, color='k', ls = '--', linewidth = 1)
# add rotation axes
ax.axline(xy1 = (0, 0), color = 'k',
slope = vpR_py[0, 1]/vpR_py[0, 0])
ax.axline(xy1 = (0, 0), color = 'k',
slope = vpR_py[1, 1]/vpR_py[1, 0])
# add arrows
for i in range(0, len(tails)):
ax.annotate("", xy = heads[i,:],
xycoords = 'data',
xytext = tails[i,:],
arrowprops=dict(arrowstyle="->",
color = 'blue'))
plt.tight_layout(pad = 2)
plt.show()
matplotlib
vegan