Tags: r, regression, phylogeny

R - Overcoming "Undefined Columns Selected"


EDIT: Now answered, find the updated code below:

I am using R to run a set of phylogenetic trees against various data sets that have different numbers of predictors. As a result, I cannot hard-code my regression formula (I am still working on a proper implementation).

Specifically, I am using the "caper" and "picante" packages, and my annotated (unfinished) code is below. Could someone please advise me on what I'm doing wrong? I just started learning R today, but I am familiar with Java, Python, Windows batch, etc.

crunchMod <- crunch(names(kap)[2] ~ ., data=contrasts)

My error occurs on the above line of code.
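
For anyone hitting the same message: a likely cause (an assumption, since the full traceback is not shown) is that R stores the left-hand side of a formula unevaluated, so the model function sees the expression names(kap)[2] rather than the column name it would return. A minimal sketch in base R, using a hypothetical data frame df in place of kap:

```r
# Hypothetical stand-in data; `kap` from the question is not shown.
df <- data.frame(y  = c(1.2, 2.3, 3.1, 4.8),
                 x1 = c(1, 2, 3, 4),
                 x2 = c(2, 1, 4, 3))

# Written literally, the left-hand side is kept as an unevaluated call.
f_bad   <- names(df)[1] ~ .
lhs_bad <- deparse(f_bad[[2]])

# Building the formula from a string evaluates names(df)[1] first,
# so the left-hand side becomes the actual column name.
f_good   <- as.formula(paste(names(df)[1], "~ ."))
lhs_good <- deparse(f_good[[2]])

print(lhs_bad)
print(lhs_good)
```

The first formula carries the text of the call on its left-hand side, while the second carries the column name, which is what a modelling function needs in order to look the column up.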

library(picante)
library(gsubfn)
library(caper)
library(ape)

#This function works off the ordering of files on windows
#because I add .treLog.tre to the file name so 1&2 are the
#same tree and 3&4 are the same tree.

setwd("C:/mywork2/MED/Male_Only99")

Dat.analysis <- function() {
  treList <- dir(pattern = "\\.tre$")  # pattern is a regular expression, not a glob
  caperDS <- read.table("dataSet.txt", header = TRUE)
  picanDS <- read.table("dataSet.txt", row.names = 1, header = TRUE)

  for (i in 1:length(names(picanDS))) {
    varName <- gsub("_|[0-9]|\\.", "", names(picanDS)[i])
    names(caperDS)[i+1] <- varName
    names(picanDS)[i]   <- varName
  }

  for (i in 1:1) { #length(treList)
    myTrees = read.nexus(treList[i])
    for (j in 1:1) { #length(myTrees)
    cat(paste("\n\n", treList[i]))
    print(multiPhylosignal(picanDS, myTrees[[j]]))

    contrasts <- comparative.data(myTrees[[j]], caperDS, Species)  # j indexes trees within the file
    if (names(caperDS)[3] == "MEDF" || names(caperDS)[3] == "MAXF") {  # test the column name, not its values
      f <- as.formula(paste(paste(names(caperDS)[2],"~"), paste(names(caperDS)[4:ncol(caperDS)], collapse="+")))
      crunchMod <- crunch(f, data = contrasts)
      print(summary(crunchMod))

      f <- as.formula(paste(paste(names(caperDS)[3],"~"), paste(names(caperDS)[4:ncol(caperDS)], collapse="+")))
      crunchMod <- crunch(f, data = contrasts)
      print(summary(crunchMod))
    } else {
      f <- as.formula(paste(paste(names(caperDS)[2],"~"), paste(names(caperDS)[4:ncol(caperDS)], collapse="+")))
      crunchMod <- crunch(f, data = contrasts)
      print(summary(crunchMod))
    }
    }
  }
}

folders <- c("C:/mywork2/MAX","C:/mywork2/MED")
for (i in 1:length(folders)) {
  paths <- list.dirs(path = folders[i], full.names = TRUE, recursive = TRUE)
  for (j in 1:length(paths)) {
    if (!(paths[j] == folders[i])) {
      setwd(paths[j])
      contrasts <- Dat.analysis()
    }
  } 
}

print("finished")

Solution

  • Following the advice of jraab, I currently have my crunchMod variable coded as:

    f <- as.formula(paste(paste(names(caperDS)[2],"~"), paste(names(caperDS)[4:ncol(caperDS)], collapse="+")))
    crunchMod <- crunch(f, data = contrasts)
    print(summary(crunchMod))
    

    And it works beautifully. Thanks a ton, and I'll update my code above even though it's far from complete.
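
    As a possible alternative to pasting the string by hand, base R's reformulate() builds the same kind of formula from a vector of predictor names and a response name. This is only a sketch: the column names below are hypothetical placeholders for names(caperDS), and it has not been run against crunch() itself.

```r
# Hypothetical column names standing in for names(caperDS);
# the real data set is not shown in the question.
cols <- c("Species", "MEDM", "MEDF", "Mass", "Latitude")

response   <- cols[2]                # second column as the response
predictors <- cols[4:length(cols)]   # columns 4 onward as predictors

# The paste() approach from the solution above.
f_paste <- as.formula(paste(response, "~", paste(predictors, collapse = " + ")))

# The same formula built with stats::reformulate().
f_ref <- reformulate(predictors, response = response)

print(f_paste)
print(f_ref)
```

    Either formula object could then be passed as the first argument in place of f. reformulate() mainly saves the manual handling of separators when the number of predictors changes between data sets.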