I have a series of vectors, each named after a stock, like FB for Facebook Inc. So I have over 70 vectors inside a data frame, for example GEEK, IPAS, JCON, etc. For each pair of stocks, say GEEK and JCON, I have to calculate a measure called mutual information. I have written some code to find that measure for one pair of stocks, and it looks like this.
To find entropyz (the joint entropy of X and Y, say the bivariate entropy of GEEK and JCON returns):

library(MASS)  # provides kde2d

denz <- kde2d(x, y, n = 512, lims = c(xlim, ylim))
z <- denz$z
cell_sizez <- (diff(xlim) / 512) * (diff(ylim) / 512)
normz <- sum(z) * cell_sizez
integrandz <- z * log(z)
# Differential entropy is H = -integral of f * log(f), so the sum is negated
entropyz <- -sum(integrandz) * cell_sizez
entropyz <- entropyz / normz
To find entropyx (the entropy of X, say GEEK returns):

library(ks)  # provides kde

denx <- kde(x = x, gridsize = 512, xmin = xlim[1], xmax = xlim[2])
zx <- denx$estimate
cell_sizex <- diff(xlim) / 512
normx <- sum(zx) * cell_sizex
integrandx <- zx * log(zx)
entropyx <- -sum(integrandx) * cell_sizex  # negated, as above
entropyx <- entropyx / normx
To find entropyy (the entropy of Y, say JCON returns):

deny <- kde(x = y, gridsize = 512, xmin = ylim[1], xmax = ylim[2])
zy <- deny$estimate
cell_sizey <- diff(ylim) / 512
normy <- sum(zy) * cell_sizey
integrandy <- zy * log(zy)
entropyy <- -sum(integrandy) * cell_sizey  # negated, as above
entropyy <- entropyy / normy
Finally, to find the mutual information of GEEK and JCON:

MI <- entropyx + entropyy - entropyz
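As a quick sanity check of the identity MI = H(X) + H(Y) - H(X, Y) that the last line relies on, here is a small discrete example (illustrative only; the kernel-density code above estimates the continuous analogue of these sums):

```r
p_x  <- c(0.5, 0.5)        # X: a fair coin
p_y  <- c(0.5, 0.5)        # Y = X, so Y is fully dependent on X
p_xy <- c(0.5, 0, 0, 0.5)  # joint distribution: mass only on the diagonal

# Discrete Shannon entropy, skipping zero-probability cells
H <- function(p) -sum(p[p > 0] * log(p[p > 0]))

mi <- H(p_x) + H(p_y) - H(p_xy)
mi  # log(2): the two variables share exactly one bit (in nats)
```

Since Y determines X completely here, the mutual information equals the full entropy of either variable, log(2) nats.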
So I have found the mutual information for X and Y (the two stocks above). But I have to calculate this measure for over 70 stocks (vectors), which means 70 * 69 / 2 = 2415 iterations; it is like building a correlation matrix, because it is a pairwise comparison.
The question is whether anyone knows a way to make R find that mutual information for all pairs (x, y) in my dataset; in other words, how to iterate this code over every pair of columns in the data frame, thus creating a pairwise matrix.
Thanks a lot!
If you create a function MI that takes your two vectors of data and returns the value, you could use something like the following to generate a symmetric square matrix with the results. If we assume your data is in a data frame df, we could do:
library(MASS)  # kde2d
library(ks)    # kde

MI <- function(x, y, xlim, ylim) {
  # Joint entropy H(X, Y) from a 2-D kernel density estimate.
  # Note the leading minus: H = -integral of f * log(f). Cells where the
  # estimate is exactly zero are skipped, since 0 * log(0) gives NaN.
  denz <- kde2d(x, y, n = 512, lims = c(xlim, ylim))
  z <- denz$z
  cell_sizez <- (diff(xlim) / 512) * (diff(ylim) / 512)
  normz <- sum(z) * cell_sizez
  entropyz <- -sum(z[z > 0] * log(z[z > 0])) * cell_sizez / normz

  # Marginal entropy H(X)
  denx <- kde(x = x, gridsize = 512, xmin = xlim[1], xmax = xlim[2])
  zx <- denx$estimate
  cell_sizex <- diff(xlim) / 512
  normx <- sum(zx) * cell_sizex
  entropyx <- -sum(zx[zx > 0] * log(zx[zx > 0])) * cell_sizex / normx

  # Marginal entropy H(Y)
  deny <- kde(x = y, gridsize = 512, xmin = ylim[1], xmax = ylim[2])
  zy <- deny$estimate
  cell_sizey <- diff(ylim) / 512
  normy <- sum(zy) * cell_sizey
  entropyy <- -sum(zy[zy > 0] * log(zy[zy > 0])) * cell_sizey / normy

  # MI(X; Y) = H(X) + H(Y) - H(X, Y)
  entropyx + entropyy - entropyz
}
df <- data.frame(1:10, 1:10, 1:10, 1:10, 1:10)
matrix(
  apply(
    expand.grid(seq_along(df), seq_along(df)), 1,
    # apply passes each row as a single vector, so unpack the two indices
    FUN = function(ij) MI(df[, ij[1]], df[, ij[2]], xlim, ylim)
  ),
  nrow = ncol(df)
)
This works because expand.grid gives you all the combinations of column indices in an n^2 by 2 data frame. We then apply the MI function to each of those pairs and store the results in a matrix.
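Since mutual information is symmetric (MI(x, y) = MI(y, x)), the expand.grid approach evaluates every pair twice. As an alternative sketch (the helper name pairwise_matrix is mine, not part of the original), you could compute only the n(n-1)/2 unique pairs with combn and mirror the results:

```r
# Sketch: evaluate each unordered pair of columns once, then mirror the
# values into a full symmetric matrix. f is any symmetric pairwise function,
# e.g. the MI function defined above; extra arguments are passed through.
pairwise_matrix <- function(df, f, ...) {
  n <- ncol(df)
  m <- matrix(NA_real_, n, n, dimnames = list(names(df), names(df)))
  idx <- t(combn(n, 2))  # n(n-1)/2 rows of (i, j) with i < j
  vals <- apply(idx, 1, function(ij) f(df[, ij[1]], df[, ij[2]], ...))
  m[idx] <- vals            # fill the upper triangle
  m[idx[, 2:1]] <- vals     # mirror into the lower triangle
  m                         # diagonal is left as NA
}

# e.g. pairwise_matrix(df, MI, xlim, ylim)
```

For 70 columns this cuts the number of MI evaluations from 4900 to the 2415 mentioned in the question.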
Edit: edited for clarity.