r ggplot2 pca vegan mds

NMDS in ggplot creating dataframe with different lengths


-see end for screenshots of data for reference-

Hello, I was following along with this YouTube tutorial for NMDS, and everything was going well until I checked the description, which said that with R updates, colours/points will not appear on the graph.

This was my original code :

data_1<-rivers_catchement[,2:27]
data_2<-rivers_catchement[1]

library("vegan")
NMDS<-metaMDS(data_1, distance= "bray", k = 2)

shape=c(18,16)
co=c("seagreen4", "olivedrab","seagreen2", "cornflowerblue", "slateblue2", "royalblue4", "brown2", "tomato1", "sienna3", "gold1", "darkgoldenrod1", "yellow2")
plot(NMDS$points, col=co[data_2$Habitat], pch = shape[data_2$Habitat], 
  cex=1.2, main="Macroinvertebrate Assemblages Across the Two Sites", xlab = "axis 1", ylab= "axis 2")

With this I was able to get a graph with the labels for the sites; however, I could not get colours and points.

I tried his updated code linked in the description of the video:

library("vegan")
library("ggplot2")
datascores <- as.data.frame(scores(NMDS)) 
scores <- cbind(as.data.frame(datascores), Habitat = data_2$Habitat)
centroids <- aggregate(cbind(NMDS1, NMDS2) ~ Habitat, data = scores, FUN = mean)
seg <- merge(scores, setNames(centroids, c('Habitat','oNMDS1','oNMDS2')),
             by = 'Habitat', sort = FALSE)

ggplot(scores, aes(x = NMDS1, y = NMDS2, colour = Habitat)) +
  geom_segment(data = seg,
               mapping = aes(xend = oNMDS1, yend = oNMDS2)) + 
  geom_point(data = centroids, size = 4) +                    
  geom_point() +                                              
  coord_fixed()+                                              
  theme_bw()+ 
  theme(legend.position="right",legend.text=element_text(size=10),legend.direction='vertical')

However, I get this error message, since I have 12 rows but 26 columns:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
arguments imply differing number of rows: 12, 26

I do not think changing the structure of the data would work, as I believe I need to call on data_2 to separate and colour-code the points by habitat.

I have attached a screenshot of my current data setup, as well as screenshots of data_1 and data_2 (see below).

Any help would be greatly appreciated. I tried other YouTube tutorials and techniques, but this is the one I understood best and that came closest to producing a graph. I think I just need to fix the length issue for the data frame.

screenshots of data

(https://i.sstatic.net/CalDJ.jpg)

(https://i.sstatic.net/JwU8b.png)

Kind Regards, Rosie


Solution

  • You requested a list of two different kinds of scores: scores for rows (sites) and scores for columns (species). This list cannot be converted to a data frame, because the two sets of scores have different numbers of rows (here 12 and 26). If you want only one kind of scores, you should say so: scores(NMDS, display="sites"). The error message probably came from the as.data.frame() command, but we cannot be sure, because you did not tell us which command gave you the error message.
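
To illustrate the point, here is a minimal sketch in base R that reproduces the same kind of failure without needing the original data. The list all_scores is a hypothetical stand-in for what scores(NMDS) returns when both kinds of scores are requested. The element names "sites" and "species", the random values, and the made-up habitat grouping are all assumptions for illustration only; only the 12 and 26 row counts come from the question.

```r
# Hypothetical stand-in for the result of scores(NMDS): a list holding one
# matrix of site (row) scores and one matrix of species (column) scores.
set.seed(1)
all_scores <- list(
  sites   = matrix(rnorm(12 * 2), ncol = 2,
                   dimnames = list(paste0("site", 1:12), c("NMDS1", "NMDS2"))),
  species = matrix(rnorm(26 * 2), ncol = 2,
                   dimnames = list(paste0("sp", 1:26), c("NMDS1", "NMDS2")))
)

# Converting the whole list to a data frame fails, because the two matrices
# have different numbers of rows. Capture the message instead of stopping.
err <- tryCatch(as.data.frame(all_scores),
                error = function(e) conditionMessage(e))
print(err)

# Keeping only the site scores gives one row per sample, so a grouping
# column of the same length can be attached.
habitat <- factor(rep(c("A", "B"), each = 6))  # made-up grouping for 12 samples
site_scores <- cbind(as.data.frame(all_scores$sites), Habitat = habitat)
str(site_scores)
```

With the real ordination object, the equivalent change in the question's code would presumably be datascores <- as.data.frame(scores(NMDS, display = "sites")), after which the cbind() with data_2$Habitat should line up, since both then have one row per site.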