I have a data.frame for which I want to generate random numbers for each list element from a sequence. I used the sample function to create the random numbers, but even though I generated random numbers for list [[1]], the same numbers were produced again for set [[2]]. How can I create different random numbers for set [[2]]?
Here is the simple code:
data.list <- lapply(1:2, function(x) {
nrep <- 1
time <- rep(seq(90,54000,by=90),times=nrep)
Mx <- rep(sort(sample(seq(0.012,-0.014,length.out = 600),replace=TRUE)), times=nrep)
My <- rep(sort(sample(seq(0.02,-0.02,length.out = 600),replace=TRUE)), times=nrep)
Mz <- rep(sort(sample(seq(-1,1,length.out=600),replace=TRUE)), times=nrep)
data.frame(time,Mx,My,Mz,set_nbr=x)
})
This gives the first 5 lines of each dataset:
[[1]]
time Mx My Mz set_nbr
1 90 -1.391319e-02 -2.000000e-02 -1.000000000 1
2 180 -1.386978e-02 -1.986644e-02 -1.000000000 1
3 270 -1.386978e-02 -1.973289e-02 -0.996661102 1
4 360 -1.382638e-02 -1.973289e-02 -0.993322204 1
5 450 -1.382638e-02 -1.973289e-02 -0.979966611 1
.. .. .... .... .... ...
[[2]]
time Mx My Mz set_nbr
1 90 -1.395659e-02 -0.0200000000 -1.000000000 2
2 180 -1.391319e-02 -0.0199332220 -0.993322204 2
3 270 -1.386978e-02 -0.0199332220 -0.993322204 2
4 360 -1.386978e-02 -0.0199332220 -0.993322204 2
5 450 -1.382638e-02 -0.0199332220 -0.986644407 2
.. .. .... .... .... ...
EDIT 1:
Following @bgoldst's answer, I can now produce different numbers:
set.seed(1);
data.list <- lapply(1:2, function(x) {
nrep <- 1;
time <- rep(seq(90,54000,by=90),times=nrep);
Mx <- rep(sort(runif(600,-0.014,0.012)),times=nrep);
My <- rep(sort(runif(600,-0.02,0.02)),times=nrep);
Mz <- rep(sort(runif(600,-1,1)),times=nrep);
data.frame(time,Mx,My,Mz,set_nbr=x);
});
On the other hand, when I change it to nrep <- 3; the same numbers are created for each repetition. This is what I wanted to avoid from the beginning.
EDIT 2:
@bgoldst showed that replicate does the job!
I think you may have some confusion about how sample() works.
First, let's examine sample()'s behavior with respect to this simple vector:
1:5;
## [1] 1 2 3 4 5
When you pass a multi-element vector to sample() it basically just randomizes the order. This means you'll usually get a different result every time; more precisely, the longer the vector is, the less likely you are to get the same result twice:
set.seed(1); sample(1:5); sample(1:5); sample(1:5);
## [1] 2 5 4 3 1
## [1] 5 4 2 3 1
## [1] 2 1 3 4 5
This means that if you sort it immediately after sampling, you'll get the same result every time. And if the original vector was itself sorted, then the result will also be equal to that original vector. This is true regardless of how sample() randomized the order, because the order is always restored by sort():
set.seed(1); sort(sample(1:5)); sort(sample(1:5)); sort(sample(1:5));
## [1] 1 2 3 4 5
## [1] 1 2 3 4 5
## [1] 1 2 3 4 5
Now if you add replace=T (or just rep=T if you like to take advantage of partial matching for concision, which I do), then you're not just randomizing the order, you're selecting size elements with replacement, where size is the vector length if you didn't provide size explicitly. This means you can get repeated elements in the result:
set.seed(1); sample(1:5,rep=T); sample(1:5,rep=T); sample(1:5,rep=T);
## [1] 2 2 3 5 2
## [1] 5 5 4 4 1
## [1] 2 1 4 2 4
And so, if you sort the result, you (likely) won't get back the original vector, because some elements will have been repeated, and some elements will have been omitted:
set.seed(1); sort(sample(1:5,rep=T)); sort(sample(1:5,rep=T)); sort(sample(1:5,rep=T));
## [1] 2 2 2 3 5
## [1] 1 4 4 5 5
## [1] 1 2 2 4 4
That's exactly what is happening with your code. Your output vectors are different between the two list components, because you're sampling with replacement before sorting, which means different repetitions and omissions of the elements will occur for each list component. But since you're sampling from the same sequence and you're sorting the result, you're bound to get similar-looking results for each list component, even though they're not identical.
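To see this with a sequence from the question, here is a small sketch (the names s, a, and b are only for illustration). Two sorted with-replacement samples should not come out identical, yet each draws only values that occur in s and typically misses some of them, which is why the first rows look so alike:

```r
set.seed(1);
s <- seq(0.012,-0.014,length.out=600);   # the Mx sequence from the question
a <- sort(sample(s,replace=TRUE));       # first draw, sorted
b <- sort(sample(s,replace=TRUE));       # second draw, sorted
identical(a,b);                          # compare the two sorted draws
length(unique(a)); length(unique(b));    # distinct values in each draw (repeats reduce this)
all(a %in% s); all(b %in% s);            # every value comes from the original sequence
```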
I think what you might be looking for is random deviates from a uniform distribution. You can get these from runif():
set.seed(1); runif(5,-0.014,0.012);
## [1] -0.0070967748 -0.0043247786 0.0008941874 0.0096134025 -0.0087562698
set.seed(1); runif(5,-0.02,0.02);
## [1] -0.009379653 -0.005115044 0.002914135 0.016328312 -0.011932723
set.seed(1); runif(5,-1,1);
## [1] -0.4689827 -0.2557522 0.1457067 0.8164156 -0.5966361
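Note that the three calls above reset the seed each time, so they are the same five underlying draws rescaled to different ranges. When the seed is set only once, each subsequent call continues the random stream and gives new values. A quick sketch of both points (u, x, y1, and y2 are illustrative names):

```r
set.seed(1); u <- runif(5);        # five draws between 0 and 1
set.seed(1); x <- runif(5,-1,1);   # same seed, so the same draws rescaled to (-1,1)
all.equal(x,-1+2*u);               # min + (max-min)*u
## [1] TRUE
set.seed(1); y1 <- runif(5); y2 <- runif(5);   # seed set once; second call continues the stream
identical(y1,y2);
## [1] FALSE
```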
Thus, your code would become:
set.seed(1);
data.list <- lapply(1:2, function(x) {
nrep <- 1;
time <- rep(seq(90,54000,by=90),times=nrep);
Mx <- rep(sort(runif(600,-0.014,0.012)),times=nrep);
My <- rep(sort(runif(600,-0.02,0.02)),times=nrep);
Mz <- rep(sort(runif(600,-1,1)),times=nrep);
data.frame(time,Mx,My,Mz,set_nbr=x);
});
Which gives:
lapply(data.list,head);
## [[1]]
## time Mx My Mz set_nbr
## 1 90 -0.01395224 -0.01994741 -0.9967155 1
## 2 180 -0.01394975 -0.01991923 -0.9933909 1
## 3 270 -0.01378866 -0.01980934 -0.9905714 1
## 4 360 -0.01371306 -0.01977090 -0.9854065 1
## 5 450 -0.01371011 -0.01961713 -0.9850108 1
## 6 540 -0.01365998 -0.01960718 -0.9846628 1
##
## [[2]]
## time Mx My Mz set_nbr
## 1 90 -0.01398426 -0.01997718 -0.9970438 2
## 2 180 -0.01398293 -0.01989651 -0.9931286 2
## 3 270 -0.01397330 -0.01988715 -0.9923425 2
## 4 360 -0.01396455 -0.01957807 -0.9913645 2
## 5 450 -0.01384501 -0.01939597 -0.9892001 2
## 6 540 -0.01382531 -0.01931913 -0.9889356 2
Edit: It looked from your question like you wanted the random numbers to be different between list components, that is to say, between the components generated from the 1:2 passed as the first argument to lapply(). The repetition of each random vector nrep times within each list component didn't appear to be relevant, partly because you set nrep to 1, so there wasn't any actual repetition.
But that's ok, we can achieve this requirement by using replicate() instead of rep(), because replicate() actually runs its expression argument once for every repetition. We also have to flatten the result, because replicate() by default returns a matrix here, and we want a straight vector:
set.seed(1);
data.list <- lapply(1:2, function(x) {
nrep <- 2;
time <- rep(seq(90,54000,by=90),times=nrep);
Mx <- c(replicate(nrep,sort(runif(600,-0.014,0.012))));
My <- c(replicate(nrep,sort(runif(600,-0.02,0.02))));
Mz <- c(replicate(nrep,sort(runif(600,-1,1))));
data.frame(time,Mx,My,Mz,set_nbr=x);
});
lapply(data.list,function(x) x[c(1:6,601:606),]);
## [[1]]
## time Mx My Mz set_nbr
## 1 90 -0.01395224 -0.01993431 -0.9988590 1
## 2 180 -0.01394975 -0.01986782 -0.9948254 1
## 3 270 -0.01378866 -0.01981143 -0.9943576 1
## 4 360 -0.01371306 -0.01970813 -0.9789037 1
## 5 450 -0.01371011 -0.01970022 -0.9697986 1
## 6 540 -0.01365998 -0.01969326 -0.9659567 1
## 601 90 -0.01396582 -0.01997579 -0.9970438 1
## 602 180 -0.01394750 -0.01997375 -0.9931286 1
## 603 270 -0.01387607 -0.01995893 -0.9923425 1
## 604 360 -0.01385108 -0.01994546 -0.9913645 1
## 605 450 -0.01375113 -0.01976155 -0.9892001 1
## 606 540 -0.01374467 -0.01973125 -0.9889356 1
##
## [[2]]
## time Mx My Mz set_nbr
## 1 90 -0.01396979 -0.01999198 -0.9960861 2
## 2 180 -0.01390373 -0.01995219 -0.9945237 2
## 3 270 -0.01390252 -0.01991559 -0.9925640 2
## 4 360 -0.01388905 -0.01978123 -0.9890171 2
## 5 450 -0.01386718 -0.01967644 -0.9835435 2
## 6 540 -0.01384351 -0.01958008 -0.9822988 2
## 601 90 -0.01396739 -0.01989328 -0.9971255 2
## 602 180 -0.01396433 -0.01985785 -0.9954987 2
## 603 270 -0.01390700 -0.01984074 -0.9903196 2
## 604 360 -0.01376890 -0.01982715 -0.9902251 2
## 605 450 -0.01366110 -0.01979802 -0.9829480 2
## 606 540 -0.01364868 -0.01977278 -0.9812671 2
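The difference between rep() and replicate() is easiest to see on a tiny example. This is only a sketch with made-up names (r1, m, v):

```r
set.seed(1);
r1 <- rep(sort(runif(3)),times=2);   # runif() runs once; its result is copied
identical(r1[1:3],r1[4:6]);
## [1] TRUE
m <- replicate(2,sort(runif(3)));    # runif() runs twice, once per column
dim(m);
## [1] 3 2
v <- c(m);                           # c() drops the dim attribute, column by column
identical(v[1:3],m[,1]); identical(v[4:6],m[,2]);
## [1] TRUE
## [1] TRUE
```

Each column of m is sorted independently, so the flattened vector v is sorted within each block of nrep but not overall, which matches rows 1:6 and 601:606 in the output above.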