
For loop only running through the last iteration when creating new contingency table


My for loop calculates the expected values from the observed values and stores them in a new contingency table (a duplicate I made earlier). To calculate an expected value, you multiply the row sum by the column sum and divide by the total.

I've nested one for loop inside another to go through the observed contingency table, calculate each expected value, and store it in the new expected table. However, when I run the code, it only computes the last iteration, i.e. the cell data[3,3].

The observed table with added margins:
              Frequently Never Rarely  Sum
  Conservative         15   214     47  276
  Liberal             119   479    173  771
  Other                85   172     45  302
  Sum                 219   865    265 1349
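
As a quick check of the formula on a single cell, the expected count for Conservative/Frequently can be worked out directly from the margins in the table above. This is only a sketch; the variable names are illustrative and not part of my function.

```r
# Margins copied from the observed table above
row_sum <- 276    # Conservative row total
col_sum <- 219    # Frequently column total
total   <- 1349   # grand total

# expected = (row sum * column sum) / total
expected_cons_freq <- row_sum * col_sum / total
print(round(expected_cons_freq, 5))  # 44.80652
```
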
The expected table:

               Frequently Never Rarely
  Conservative         15   214     47
  Liberal             119   479    173
  Other                85   172     45

viewsandpot is the name I've given the data, which I've already read in from a file (so it is a table).

expecteddata <- function(rawdata){
  observedtable <- table(factor(rawdata[,2]), factor(rawdata[,1]))
  observedtable <- addmargins(observedtable)
  expectedtable <- observedtable
  i <- 1
  j <- 1
  ncol <- ncol(observedtable)
  nrow <- nrow(observedtable)
  for(i in nrow-1){
    j <- 1
    for(j in ncol-1){
      expectedtable[i,j] <- (observedtable[i, ncol]*observedtable[nrow, j])/observedtable[ncol, nrow]
      j <- j+1
    }
  }
  return(expectedtable)
}
expecteddata(viewsandpot)

The expected-value contingency table should look like the observed one, but with the counts replaced by the calculated values (so the numbers should be different).

Only the last iteration works. The result I get from the code is:

             Frequently     Never    Rarely
  Conservative   15.00000 214.00000  47.00000
  Liberal       119.00000 479.00000 173.00000
  Other          85.00000 172.00000  59.32543

So 59.325 is the only different number.

I'm not sure why the loops don't work, since the inner for loop should first replace the entire first row and then move on to the next row.


Solution

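  • I think I finally got it; I hope this is the solution you wanted. The main problem is that for(i in nrow-1) loops over the single value nrow-1 rather than the range 1:(nrow-1), so only the last cell was ever updated: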

    Frequently <- c(15, 119, 85) # observed counts, one vector per column
    Never <- c(214, 479, 172)
    Rarely <- c(47, 173, 45)
    # build the observed counts as a data frame to pass to the function
    data <- data.frame(Frequently, Never, Rarely, row.names = c("Conservative", "Liberal", "Other"))
    
    expecteddata <- function(rawdata) {
      # build a matrix from the first, second and third columns of the data frame
      observedtable <- matrix(data = c(rawdata[,1], rawdata[,2], rawdata[,3]), ncol = 3)
      # add the row and column sums as margins
      observedtable <- addmargins(observedtable)
      # placeholder expected table (values 1 to 9); every cell is overwritten in the loops below
      expectedtable <- matrix(1:9, ncol = 3)
      # set the names of the columns and rows:
      colnames(expectedtable) <- c("Frequently", "Never", "Rarely")
      rownames(expectedtable) <- c("Conservative", "Liberal", "Other")
    
      ncol <- ncol(observedtable)
      nrow <- nrow(observedtable)            
      total <- observedtable[nrow, ncol]
      for (i in 1:(nrow - 1)) { # for (i in nrow - 1) loops over one single value; 1:(nrow - 1) is the range 1 to nrow - 1 (a range in R is always from:to)
        for (j in 1:(ncol - 1)) { # no need to reset j to 1 in each outer iteration; the for loop does that automatically
          rowSum <- observedtable[i, ncol]
          colSum <- observedtable[nrow, j]
          expectedtable[i, j] <- (rowSum * colSum) / total
        }
      }
      return(expectedtable)
    }
    print(expecteddata(data))
    

    This is the output:

                 Frequently    Never    Rarely
    Conservative   44.80652 176.9755  54.21794
    Liberal       125.16605 494.3773 151.45663
    Other          49.02743 193.6471  59.32543
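
Two points from the answer can be seen in isolation in the sketch below: why the original loop header visited only one index, and a loop-free way to get the same expected table. This is a hedged illustration, not the answerer's code. The names observed, visited_single, visited_range, expected and reference are made up here, and the cross-check assumes the expected component returned by chisq.test for a two-way table.

```r
# Observed counts from the question, without margins.
observed <- matrix(
  c(15, 119, 85,     # Frequently
    214, 479, 172,   # Never
    47, 173, 45),    # Rarely
  ncol = 3,
  dimnames = list(c("Conservative", "Liberal", "Other"),
                  c("Frequently", "Never", "Rarely"))
)

# 1) for (i in n - 1) iterates over the single value n - 1,
#    while seq_len(n - 1) iterates over 1, 2, ..., n - 1.
n <- 4
visited_single <- c()
for (i in n - 1) visited_single <- c(visited_single, i)
visited_range <- c()
for (i in seq_len(n - 1)) visited_range <- c(visited_range, i)
print(visited_single)  # 3
print(visited_range)   # 1 2 3

# 2) Loop-free version: the outer product of the row sums and column sums,
#    divided by the grand total, gives every expected cell at once.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
print(round(expected, 5))

# Cross-check against the expected counts computed by chisq.test.
reference <- chisq.test(observed)$expected
print(all.equal(expected, reference, check.attributes = FALSE))
```

seq_len(n - 1) is used instead of 1:(n - 1) because it yields an empty sequence when n - 1 is 0, whereas 1:0 would count down through 1 and 0.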