Tags: r, random, montecarlo

Using rnorm for a dataframe


I want to apply the rnorm function to each row of a data frame with, e.g., nrow = 11451 rows. I don't know how to write the code so that each row produces nsim simulated values, giving a simulated data frame with nrow rows and nsim columns.

dfsim <- rnorm (n=nsim, mean=df[[?]], sd=df[[?]])

As an example:

> head(df)
An object of class "SpatialLinesDataFrame"
Slot "data":
           LINEARID            FULLNAME RTTYP MTFCC          M01         SD01 Nsim
10969 1104486135650       US Hwy 90 Alt     U S1200 0.0009886878 0.0001253361   10
10970 1104486135651       US Hwy 90 Alt     U S1200 0.0009831224 0.0001442643   10
10416 1102965182224 Southwest Fwy E Acc     M S1640 0.0010000000 0.0000000000   10
10494 1103342335512   Robin Hood Ct Pvt     M S1780 0.0010000000 0.0000000000   10
10493 1103342334514 Little John Way Pvt     M S1750 0.0010000000 0.0000000000   10
1847  1101842210421      Arrowood Cir N     M S1400 0.0010000000 0.0000000000   10

My expected result is ten additional columns per row holding the simulated values.

I tried the following code but got an "invalid arguments" error:

> dfnorm <- apply(df@data, 1, function(x) rnorm(x["Nsim"], mean=x["M01"], sd=x["SD01"]))
 Error in rnorm(x["Nsim"], mean = x["M01"], sd = x["SD01"]) : 
  invalid arguments 

Since the data frame is too large to share, I used the subset function to keep only three rows and saved them to an .rdata file. Here is the link: df.rdata


Solution

  • Your data frame needs a column holding the sample size, for example:

    dataFrameApply <- data.frame(sampleSize = c(100, 100, 100),
                                 meanNum = c(1, 2, 3), sdNum = c(1, 2, 3))
    dataFrameApply
    #   sampleSize meanNum sdNum
    # 1        100       1     1
    # 2        100       2     2
    # 3        100       3     3
    

    Then use apply to go over each row. The second argument (MARGIN) is 1 to apply the function over rows and 2 to apply it over columns.

    normalize <- apply(dataFrameApply, 1, function(x) rnorm(x[1], mean=x[2], sd=x[3]))
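
The shape of the result is easy to get wrong, so here is a minimal sketch using the same toy data frame. It assumes every row has the same sample size; with differing sizes apply would return a list instead of a matrix. The name sims is only illustrative.

```r
set.seed(1)  # only so the sketch is reproducible

dataFrameApply <- data.frame(sampleSize = c(100, 100, 100),
                             meanNum = c(1, 2, 3), sdNum = c(1, 2, 3))

# Each data frame row yields a vector of 100 draws; apply binds these
# vectors together as the COLUMNS of the result.
normalize <- apply(dataFrameApply, 1,
                   function(x) rnorm(x[1], mean = x[2], sd = x[3]))
dim(normalize)  # draws in rows, one column per data frame row

# Transpose to get one row per data frame row and one column per draw.
sims <- t(normalize)
dim(sims)
```

So to end up with nrow rows and nsim columns, as asked in the question, the apply result has to be transposed.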
    

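    The call in the question fails because apply coerces the data frame to a matrix; with non-numeric columns such as FULLNAME present, every value becomes a string, and rnorm rejects string arguments. Selecting only the numeric columns (Nsim, M01, SD01) first avoids this. This worked for me on my machine: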

    dfDataFrame  <- as.data.frame(df@data)
    # Keep only the numeric columns, in the order Nsim, M01, SD01
    dataFrameSub <- dfDataFrame[, c(7, 5, 6)]
    normalize    <- apply(dataFrameSub, 1,
                          function(x) rnorm(x[1], mean = x[2], sd = x[3]))
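
To get the ten extra columns the question asks for, the simulated values still have to be attached to the data. The sketch below is one possible way, not the method used above: it swaps apply for a single vectorized rnorm call. It assumes Nsim is the same in every row, and roads is a hypothetical stand-in for df@data built from the values shown in the question; the names sims, roadsSim and the sim1..sim10 column labels are likewise made up for illustration.

```r
set.seed(42)  # only so the sketch is reproducible

# Hypothetical stand-in for df@data (three rows from the question).
roads <- data.frame(
  FULLNAME = c("US Hwy 90 Alt", "US Hwy 90 Alt", "Southwest Fwy E Acc"),
  M01      = c(0.0009886878, 0.0009831224, 0.0010000000),
  SD01     = c(0.0001253361, 0.0001442643, 0.0000000000),
  Nsim     = c(10, 10, 10),
  stringsAsFactors = FALSE
)

nsim <- roads$Nsim[1]  # assumption: identical for all rows

# rnorm recycles mean and sd across the draws, and matrix() fills
# column by column, so row i only ever receives draws that used
# M01[i] and SD01[i].
sims <- matrix(rnorm(nrow(roads) * nsim, mean = roads$M01, sd = roads$SD01),
               nrow = nrow(roads), ncol = nsim)
colnames(sims) <- paste0("sim", seq_len(nsim))

# Attach the simulated values as additional columns.
roadsSim <- cbind(roads, sims)
dim(roadsSim)
```

Rows with SD01 equal to 0 simply repeat their mean in every simulated column. For the full data the same idea should carry over by using df@data in place of roads, though that has not been run against the original file.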