I'm trying to count the number of occurrences of each "scenario" (0 to 9) in a data frame over 25 years. Basically, I have 10,000 simulations of scenarios named 0 to 9, each scenario having a probability of occurrence.
My dataframe is too big to paste in here but here's a preview:
simulation=as.data.frame(replicate(10000,sample(c(0:9),size=25,replace=TRUE,prob=prob)))
simulation2=transpose(simulation)
Note: prob is a vector giving the probability of observing each scenario.
       v1 v2 v3 v4 v5 v6 ... v25
1       0  0  4  0  2  0 ...   9
2       1  0  0  2  3  0 ...   6
3       0  4  6  2  0  0 ...   0
4
...
10000
This is what I have tried so far:
for (i in c(1:25)){
  for (j in c(0:9)){
    f=sum(simulation2[,i]==j);
    vect_f=c(vect_f,f)
  }
  vect_f=as.data.frame(vect_f)
}
If I omit the outer loop, "for (i in c(1:25))", I get the correct first column of the desired output. Now I am trying to repeat this for all 25 years, but when I add the outer 'for' loop I do not get the desired output.
The output should look like this :
(Year) 1 2 3 4 5 6 ... 25
(Scenario)
0 649
1 239
...
9 11
649 being the number of times 'scenario 0' is observed the first year over my 10 000 simulations.
Thanks for your help
We can use table(), applied to each column with sapply():
sapply(simulation2, table)
# V1 V2 V3 V4 V5 .....
#0 1023 1050 994 1016 1022 .....
#1 1050 968 950 1001 981 .....
#2 997 969 1004 999 949 .....
#3 1031 977 1001 993 1009 .....
#4 1017 1054 1020 1003 985 .....
#......
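To try this without the original data, here is a minimal reproducible sketch. The question's prob vector is not shown, so prob_assumed (equal probabilities) is a placeholder, and base R's t() stands in for transpose(); both are assumptions made only for illustration.

```r
set.seed(42)  # arbitrary seed, only so the run is repeatable

# Assumption: equal probabilities; the question's actual `prob` is not shown.
prob_assumed <- rep(0.1, 10)

# replicate() gives a 25 x 10000 matrix (years x simulations);
# t() flips it so that rows are simulations and columns are years.
sim <- replicate(10000, sample(0:9, size = 25, replace = TRUE, prob = prob_assumed))
simulation2 <- as.data.frame(t(sim))

# One table per column (year), simplified by sapply() into a matrix
counts <- sapply(simulation2, table)

dim(counts)      # scenarios by years
colSums(counts)  # each year's counts add up to the number of simulations
```

Checking that every column sums to the number of simulations is a quick way to confirm that no scenario was dropped along the way.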
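If certain values are missing from a column, the tables have different lengths and sapply() can no longer simplify the result to a matrix. In that case, convert each column to a factor that includes all levels, so that missing scenarios are counted as 0: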
sapply(simulation2, function(x) table(factor(x, levels = 0:9)))
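As for the nested loop in the question: calling as.data.frame() inside the outer loop turns vect_f into a data frame after the first year, so the later c(vect_f, f) calls no longer build a plain vector. If a loop is preferred, one possible fix is to preallocate a 10 x 25 matrix and fill it cell by cell. The sketch below is only an illustration: prob_assumed, n_sim and counts_loop are made-up names and values, not taken from the question.

```r
set.seed(1)  # arbitrary seed, only so the run is repeatable

# Assumption: made-up, unequal probabilities in which scenario 9 is rare
prob_assumed <- c(0.30, 0.20, 0.15, 0.10, 0.08, 0.07, 0.05, 0.03, 0.015, 0.005)
n_sim <- 200  # hypothetical, small on purpose so some scenarios may be absent in a year

simulation2 <- as.data.frame(
  t(replicate(n_sim, sample(0:9, size = 25, replace = TRUE, prob = prob_assumed)))
)

# Preallocate: rows = scenarios 0..9, columns = years 1..25
counts_loop <- matrix(0L, nrow = 10, ncol = 25,
                      dimnames = list(Scenario = as.character(0:9),
                                      Year = as.character(1:25)))

for (i in 1:25) {
  for (j in 0:9) {
    # j + 1 because scenario 0 is stored in row 1
    counts_loop[j + 1, i] <- sum(simulation2[, i] == j)
  }
}

# Same counts via the factor-based approach, for comparison
counts_factor <- sapply(simulation2, function(x) table(factor(x, levels = 0:9)))
all(counts_loop == counts_factor)
```

Both approaches should give the same counts; the sapply() version is shorter, while the loop makes the indexing explicit.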