APD<- read.csv("APD.csv",header=FALSE)
rar<- read.csv("rar.csv",header=FALSE)
## Making column name labels
tree<-rep("tree",100)
tree_labels<-c(1:100)
colnames(APD)<-c(paste(tree,tree_labels, sep=""))
colnames(rar)<-c(paste(tree,tree_labels, sep=""))
correlation.csv<- cor(x=APD, y=rar, method = "spearman")
The script above is supposed to calculate the correlation between the columns of two data sets, but there are two problems. First, it starts labeling the output from the first column (which holds the row names), so the last score gets NA as its label. Second, and I'm not sure whether I'm thinking about this correctly, maybe for the same reason R treats APD as a data frame and so does not calculate the last line.
Cheers
A subset of the CSV file:
V1 V2 V3 V4
t1 9.368703877 9.693286792 12.44129352 13.06908296
t10 8.128921254 8.940227268 11.40226603 12.17704779
t11 7.87062995 8.697508965 11.39250803 12.17704779
After reading it in, it looks like this:
V2 V3 V4 V5
V1 V2 V3 V4
t1 9.368703877 9.693286792 12.44129352 13.06908296
t10 8.128921254 8.940227268 11.40226603 12.17704779
t11 7.87062995 8.697508965 11.39250803 12.17704779
For the first problem, you can use row.names:
APD <- read.csv("APD.csv", header = FALSE, row.names = 1)
rar <- read.csv("rar.csv", header = FALSE, row.names = 1)
From the documentation:
row.names: a vector of row names. This can be a vector giving the actual row names, or a single number giving the column of the table which contains the row names, or character string giving the name of the table column containing the row names.
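To see the effect of row.names = 1 without the real files, here is a minimal sketch. The inline csv_text is a hypothetical three-row stand-in for APD.csv (it assumes a comma-separated file with no header line and the labels in the first field), and the variable names are made up for illustration:

```r
# Hypothetical stand-in for APD.csv: no header line,
# row labels (t1, t10, t11) in the first field.
csv_text <- "t1,9.368703877,9.693286792,12.44129352,13.06908296
t10,8.128921254,8.940227268,11.40226603,12.17704779
t11,7.87062995,8.697508965,11.39250803,12.17704779"

# Without row.names, the labels are read as an ordinary column (V1),
# so the data frame has one extra, non-numeric column.
with_label_col <- read.csv(text = csv_text, header = FALSE)
ncol(with_label_col)

# With row.names = 1, the first field becomes the row names and
# only the numeric columns (V2..V5) remain.
with_row_names <- read.csv(text = csv_text, header = FALSE, row.names = 1)
ncol(with_row_names)
rownames(with_row_names)
sapply(with_row_names, is.numeric)
```

With the label column out of the way, the 100 names assigned with colnames() should line up with the 100 data columns instead of being shifted by one.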
For the second problem, you can use mapply:
mapply(cor, APD, rar, MoreArgs = list(method = "spearman"))
assuming that you want the correlation between the 1st column of each table, then the 2nd column, and so on.
Using your example:
> str(a)
'data.frame': 3 obs. of 4 variables:
$ V1: num 9.37 8.13 7.87
$ V2: num 9.69 8.94 8.7
$ V3: num 12.4 11.4 11.4
$ V4: num 13.1 12.2 12.2
> mapply(cor, a, -a, MoreArgs = list(method = "spearman"))
V1 V2 V3 V4
-1 -1 -1 -1
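For comparison, here is a rough sketch of how the mapply approach relates to calling cor on the two data frames directly. The data is random and made up (x and y are hypothetical stand-ins for APD and rar); the point is only that cor(x, y) returns every column-against-column pair, while mapply returns just the matched pairs, which sit on the diagonal of that matrix:

```r
# Made-up data: 3 hypothetical columns, 6 observations each
set.seed(1)
x <- as.data.frame(matrix(rnorm(18), nrow = 6))
y <- as.data.frame(matrix(rnorm(18), nrow = 6))

# cor() on two data frames correlates every column of x
# with every column of y, giving a 3 x 3 matrix here.
all_pairs <- cor(x, y, method = "spearman")
dim(all_pairs)

# mapply() walks both data frames column by column:
# 1st with 1st, 2nd with 2nd, and so on.
matched <- mapply(cor, x, y, MoreArgs = list(method = "spearman"))

# The matched pairs are the diagonal of the full matrix.
all.equal(unname(matched), unname(diag(all_pairs)))
```

If only the matched pairs are needed, mapply avoids computing the full 100 x 100 matrix; if the full matrix is what you want, the original cor(x = APD, y = rar, method = "spearman") call is already the right tool once the row names are read correctly.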