I have a set of (x,y,z,u,v,w) vectors of N objects.
What I am trying to do is expand this data set by cloning these objects with Monte Carlo simulation.
I was wondering if this is reasonable. If so, how can I do this in Python, and if not, what is the alternative?
What I am used to is having uncertainty values for the vector components and drawing from some multivariate distribution to get "clone data", i.e. data that represent the uncertainty. In this case I do not have uncertainties, so I am trying to get a synthetic distribution instead.
To create new data with an MC simulation you need something to simulate from, and it is not entirely clear from your question how that simulation should take place. If you mean generating new samples that are recombinations of the existing data, you could call that MC in some sense. That can be achieved by choosing one element from each component vector at random. Code example (not tested nor optimized, but conceptually working):
import numpy as np

data = ...        # sequence of arrays, one per variable, each holding the N observed values
n_new_data = ...  # number of synthetic samples to generate

new_data = np.full((n_new_data, len(data)), np.nan)
for i, vec in enumerate(data):
    for j in range(n_new_data):
        # pick a random observed value of variable i (with replacement);
        # np.random.random_integers is removed in modern NumPy, so use randint
        new_data[j, i] = vec[np.random.randint(len(vec))]
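As a side note, the double loop can be collapsed into one vectorized draw per variable; a sketch using NumPy's newer Generator API, assuming `data` and `n_new_data` as above (`rng.choice` samples with replacement by default):

rng = np.random.default_rng()
# resample each variable independently and stack the columns
new_data = np.column_stack([rng.choice(vec, size=n_new_data) for vec in data])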
If the raw data are all you have to work with, this can be a somewhat reasonable approach. The other option, more complex but more realistic when the variables are not independent, is to estimate the correlations between the variables and then generate new data according to that correlation structure.
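For the correlated option, a minimal sketch, assuming your observations sit in an (N, 6) array of (x, y, z, u, v, w) rows and that a multivariate normal is a tolerable approximation of the joint distribution (the function name `clone_correlated` is just illustrative):

import numpy as np

def clone_correlated(data, n_new_data, seed=None):
    # fit a multivariate normal to the sample and draw synthetic clones from it
    rng = np.random.default_rng(seed)
    mean = data.mean(axis=0)          # per-variable means
    cov = np.cov(data, rowvar=False)  # 6x6 covariance matrix of the variables
    return rng.multivariate_normal(mean, cov, size=n_new_data)

If the joint distribution is clearly non-Gaussian, this assumption breaks down, but the same idea (estimate the dependence structure, then sample from it) carries over to other parametric choices.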