Tags: matlab, vector, random, euclidean-distance, uniform-distribution

How to generate random uniformly distributed vectors of Euclidean length one?


I am trying to randomly generate uniformly distributed vectors that have a Euclidean length of 1. By uniformly distributed I mean that each entry (coordinate) of the vectors is uniformly distributed.

More specifically, I would like to create a set of, say, 1000 vectors (let's call them V_i, with i=1,…,1000), where each of these random vectors has unit Euclidean length and the same dimension, V_i=(v_1i,…,v_ni)' (let's say n = 5, but the algorithm should work with any dimension). If we then look at the distribution of, e.g., v_1i, the first element of each V_i, I would like it to be uniform.

In the attached MATLAB example you can see that you cannot simply draw random vectors from a uniform distribution and then normalize them to a Euclidean length of 1, because the distribution of the elements across the vectors is then no longer uniform.

Is there a way to generate this set of vectors such that the distribution of each element across the set is uniform?

Thank you for any ideas.

PS: MATLAB is our language of choice, but solutions in any language are, of course, welcome.

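    clear all
    rng('default')     % reproducible random numbers

    nvar = 5;          % dimension of each vector
    sample = 1000;     % number of vectors

    x = zeros(nvar, sample);

    for ii = 1:sample
        y = rand(nvar, 1);         % entries drawn uniformly from (0,1)
        x(:,ii) = y ./ norm(y);    % scale to unit Euclidean length
    end

    % One histogram per coordinate, each in its own figure
    for k = 1:nvar
        figure
        hist(x(k,:))
    end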

Solution

  • What you want cannot be accomplished.

    Vectors with a length of 1 sit on a circle (or a sphere or hypersphere, depending on the number of dimensions). Let's focus on the 2D case, where the constraint is easiest to see; the coordinates are tied together in the same way in higher dimensions.

    Because the points are on a circle, their x and y coordinates are dependent: each one can be computed from the other, up to sign. Thus, the distributions of the x and y coordinates cannot be defined independently. We can define the distribution of one coordinate and generate random values for it, but the other coordinate must then be computed from the first.

    Let's make points on a half circle with a uniform x coordinate (can be extended to a full circle by adding a random sign to the y coordinate):

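    N = 1000;
    x = 2 * rand(N,1) - 1;    % x uniform on [-1, 1]
    y = sqrt(1 - x.^2);       % y follows from the unit-length constraint
    plot(x,y,'.')
    axis equal
    figure                    % new figure so the histogram does not replace the scatter plot
    histogram(y)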
    

    The histogram shows a clearly non-uniform distribution, with many more samples near y=1 than near y=0. If we add a random sign to the y coordinate, we get more samples near y=1 and y=-1 than near y=0.
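
The shape of that histogram can also be checked against a formula. The following is a minimal sketch, not part of the original answer. It assumes x is uniform on [-1, 1] and y = sqrt(1 - x^2), as in the half-circle example. Under that assumption P(y <= t) = P(|x| >= sqrt(1 - t^2)) = 1 - sqrt(1 - t^2), so the density of y is t / sqrt(1 - t^2) on [0, 1). The names F, f, edges, empirical and derived are introduced here for illustration only.

```matlab
% Sketch: compare the empirical distribution of y with the derived one.
% Assumption: x is uniform on [-1, 1] and y = sqrt(1 - x.^2).
rng('default')                      % reproducible draws
N = 1e5;
x = 2 * rand(N, 1) - 1;             % uniform x coordinate on [-1, 1]
y = sqrt(1 - x.^2);                 % y coordinate forced by unit length

% Derived CDF and density of y on [0, 1) (illustrative, see lead-in):
%   F(t) = 1 - sqrt(1 - t^2),   f(t) = t / sqrt(1 - t^2)
F = @(t) 1 - sqrt(1 - t.^2);
f = @(t) t ./ sqrt(1 - t.^2);

% Empirical vs. derived probability mass in ten equal-width bins
edges = linspace(0, 1, 11);
empirical = histcounts(y, edges) / N;
derived = diff(F(edges));
disp([empirical(:), derived(:)])    % a uniform y would give 0.1 in every row

% Overlay the derived density on a normalized histogram
figure
histogram(y, 50, 'Normalization', 'pdf')
hold on
t = linspace(0, 0.999, 500);        % stop short of t = 1, where f diverges
plot(t, f(t), 'LineWidth', 1.5)
hold off
xlabel('y'), ylabel('density')
```

If y were uniform, every bin would hold about 10% of the samples. Under the derived CDF the last bin, [0.9, 1], holds sqrt(0.19), or about 44%, and the first bin holds about 0.5%, which matches the pile-up near y=1 seen in the histogram.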