Say I have three random variables, and I would like to use a convolution to obtain the distribution of their average. How do I do this in Python and/or R?
Also, it seems the default behavior is for the convolution to be larger than any of the inputs. Assume that all of the inputs are the same size. Is it possible to make the resulting convolution the same size as the vectors used as inputs?
For example, if x1 has n=100, then I would like the resulting convolution to have n=100.
In theory, the convolution should be close to what I can calculate analytically.
from functools import reduce
import numpy as np
rng = np.random.default_rng(42)
n, u1, u2, u3, sd = 100, 10, 20, 6, 5
u_avg = np.mean([u1,u2,u3])
a = rng.normal(u1, sd, size=n)
b = rng.normal(u2, sd, size=n)
c = rng.normal(u3, sd, size=n)
z = rng.normal(u_avg, sd/np.sqrt(3), size=n)
convolution = rng.choice(reduce(np.convolve, [a, b, c]), size=n)
print("true distribution")
print(np.round(np.quantile(z, [0.01, 0.25, 0.5, 0.75, 0.99]), 2))
print("convolution")
print(np.round(np.quantile(convolution, [0.01, 0.25, 0.5, 0.75, 0.99]),2))
If the convolution is working, then the convolution should be close to the true distribution.
true distribution
[ 3.9 9.84 12.83 14.89 18.45]
convolution
[5.73630000e+03 5.47855750e+05 2.15576037e+06 6.67763665e+06
8.43843281e+06]
It looks like the convolution is not even close.
I think you have misused convolution here. It does give the PDF of a sum of independent random variables (normally distributed in your case), but the convolution is applied to the PDFs, not to samples of the random variables.
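To make the point about PDFs concrete, here is a minimal Python sketch that convolves the three normal densities on a grid. The grid limits, the step `dx`, and the helper `normal_pdf` are my own illustrative choices, not something from the question. Each discrete convolution is multiplied by `dx` so that it approximates the convolution integral.

```python
import numpy as np

u1, u2, u3, sd = 10.0, 20.0, 6.0, 5.0

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x (hypothetical helper)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Assumed grid: wide enough that all three densities are ~0 at the edges.
grid = np.linspace(-40.0, 80.0, 2401)
dx = grid[1] - grid[0]

p1 = normal_pdf(grid, u1, sd)
p2 = normal_pdf(grid, u2, sd)
p3 = normal_pdf(grid, u3, sd)

# PDF of the sum S = X1 + X2 + X3: convolve the densities.
# Two convolutions, so two factors of dx.
p_sum = np.convolve(np.convolve(p1, p2), p3) * dx**2

# The full convolution lives on a grid that starts at 3 * grid[0].
s = 3 * grid[0] + dx * np.arange(p_sum.size)

# Moments of the sum, then rescale to the average A = S / 3.
mean_sum = np.sum(s * p_sum) * dx
var_sum = np.sum((s - mean_sum) ** 2 * p_sum) * dx
mean_avg = mean_sum / 3
sd_avg = np.sqrt(var_sum) / 3

print(mean_avg, sd_avg)
```

The two printed values should be close to the analytic mean 12 and standard deviation 5/sqrt(3) ≈ 2.887. Note that the convolution gives the density of the sum; the average is obtained by rescaling the axis by 1/3.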
Below is some example code that may help you get the result you want. Given the means u1, u2, u3 and the corresponding samples as below
set.seed(1)
n <- 100
u1 <- 10
u2 <- 20
u3 <- 6
SD <- 5
u <- mean(c(u1, u2, u3))
x1 <- rnorm(n, u1, SD)
x2 <- rnorm(n, u2, SD)
x3 <- rnorm(n, u3, SD)
z <- rnorm(n, u, SD / sqrt(3)) # sample drawn from the target (desired) distribution
and you should use the following operation to construct the target random variable x, if you want to work on the random variables x1, x2 and x3:
x <- rowMeans(cbind(x1, x2, x3))
and you will see that
> mean(x)
[1] 12.16792
> sd(x)
[1] 2.752923
while the "desired" statistics are
> u
[1] 12
> SD / sqrt(3)
[1] 2.886751
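Since the question also asks about Python, the same idea can be sketched with NumPy. This is a rough equivalent of the `rowMeans` call above, reusing the seed and parameters from the question; the resulting numbers will differ from the R output because the random number generators differ.

```python
import numpy as np

rng = np.random.default_rng(42)
n, u1, u2, u3, sd = 100, 10, 20, 6, 5

a = rng.normal(u1, sd, size=n)
b = rng.normal(u2, sd, size=n)
c = rng.normal(u3, sd, size=n)

# Element-wise average of the three samples (the analogue of rowMeans).
x = np.mean(np.column_stack([a, b, c]), axis=1)

print(x.shape)                      # same length as each input
print(x.mean(), x.std(ddof=1))      # compare with 12 and 5 / sqrt(3)
```

The result has the same length as the inputs, which is what the question asks for, and no convolution is needed when working with samples directly.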
You can try conv from the pracma package:
> library(pracma)
> x <- c(1, 2)
> y <- c(2, 3, 4)
> z <- c(-1, 0, 1, 5)
> Reduce(conv, list(x, y, z))
[1] -2 -7 -8 9 45 58 40
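For comparison, a Python sketch of the same computation with `functools.reduce` and `np.convolve`. The second part uses `mode="same"`, which returns an array as long as the longer input, so equal-length inputs give an equal-length result; the all-ones vectors there are just placeholder data I made up to show the shapes.

```python
from functools import reduce
import numpy as np

x = np.array([1, 2])
y = np.array([2, 3, 4])
z = np.array([-1, 0, 1, 5])

# Default mode is "full": output length is len(x) + len(y) + len(z) - 2.
full = reduce(np.convolve, [x, y, z])
print(full.tolist())  # [-2, -7, -8, 9, 45, 58, 40]

# mode="same" keeps only the central part of the full convolution,
# with length max(len(p), len(q)).
p = np.ones(100)
q = np.ones(100)
r = np.ones(100)
same = reduce(lambda u, v: np.convolve(u, v, mode="same"), [p, q, r])
print(same.shape)  # (100,)
```

Keep in mind that `mode="same"` truncates the full result, so it answers the size question but discards the tails of the convolution.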