deep-learning, pytorch

Estimating a mixture-of-Gaussians model in PyTorch


I actually want to estimate a normalizing flow with a mixture of Gaussians as the base distribution, so I'm more or less stuck with torch. You can reproduce my error by simply estimating a mixture-of-Gaussians model in torch. My code is below:

import numpy as np
import matplotlib.pyplot as plt
import sklearn.datasets as datasets

import torch
from torch import nn
from torch import optim
import torch.distributions as D

num_layers = 8
# device and optimizer1 are defined earlier in the full script (omitted here)
weights = torch.ones(8, requires_grad=True).to(device)
means = torch.tensor(np.random.randn(8, 2), requires_grad=True).to(device)
stdevs = torch.tensor(np.abs(np.random.randn(8, 2)), requires_grad=True).to(device)
mix = D.Categorical(weights)
comp = D.Independent(D.Normal(means, stdevs), 1)
gmm = D.MixtureSameFamily(mix, comp)

num_iter = 10001
for i in range(num_iter):
    x = torch.randn(5000, 2)  # x can be arbitrary samples
    loss2 = -gmm.log_prob(x).mean()  # in the full script: -densityflow.log_prob(inputs=x).mean()
    optimizer1.zero_grad()
    loss2.backward()
    optimizer1.step()

The error I get is:

0
8.089411823514835
Traceback (most recent call last):

  File "/home/cameron/AnacondaProjects/gmm.py", line 183, in <module>
    loss2.backward()

  File "/home/cameron/anaconda3/envs/torch/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)

  File "/home/cameron/anaconda3/envs/torch/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag

RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

As you can see, the model runs for only one iteration before failing.
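For context, this error is not specific to mixture models: it appears whenever a computation graph that was built once is backpropagated through a second time without retain_graph=True, because the first backward pass frees the graph's saved buffers. A minimal sketch (not part of the original code) that triggers the same message:

    import torch

    w = torch.ones(3, requires_grad=True)
    y = (w * w).sum()   # the graph for y is built once, here

    y.backward()        # first backward works, then the graph's buffers are freed
    y.backward()        # second backward through the same freed graph raises:
                        # RuntimeError: Trying to backward through the graph a second time ...

In the code above the same thing happens indirectly: gmm is constructed once before the loop, and the distribution object caches some graph-building computations on its parameters (for example the normalization of the mixture weights inside Categorical), so the second iteration's backward pass runs through a graph that was already freed in the first iteration.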


Solution

  • There is an ordering problem in your code: you create the Gaussian mixture model outside of the training loop, so it is built from the initial values of the parameters. After optimizer1.step() modifies those parameters in place, the graph cached inside the distribution no longer matches them, and even if you set loss2.backward(retain_graph=True) you only trade one error for another: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation (a short sketch of this second failure is shown after the working code below).

    The fix is simply to create a new Gaussian mixture model whenever the parameters are updated, i.e. inside the training loop. Example code that runs as expected:

    import numpy as np
    import matplotlib.pyplot as plt
    import sklearn.datasets as datasets
    
    import torch
    from torch import nn
    from torch import optim
    import torch.distributions as D
    
    num_layers = 8
    # the mixture parameters are plain leaf tensors, so the optimizer can update them directly
    weights = torch.ones(8, requires_grad=True)
    means = torch.tensor(np.random.randn(8, 2), requires_grad=True)
    stdevs = torch.tensor(np.abs(np.random.randn(8, 2)), requires_grad=True)

    parameters = [weights, means, stdevs]
    optimizer1 = optim.SGD(parameters, lr=0.001, momentum=0.9)
    
    num_iter = 10001
    for i in range(num_iter):
        # rebuild the mixture from the current parameter values on every iteration
        mix = D.Categorical(weights)
        comp = D.Independent(D.Normal(means, stdevs), 1)
        gmm = D.MixtureSameFamily(mix, comp)

        optimizer1.zero_grad()
        x = torch.randn(5000, 2)  # x can be arbitrary samples
        loss2 = -gmm.log_prob(x).mean()  # in the flow setting: -densityflow.log_prob(inputs=x).mean()
        loss2.backward()
        optimizer1.step()

        print(i, loss2.item())
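
    For completeness, the sketch below (not from the original post) shows the second failure mode mentioned above: if the mixture is still built outside the loop and you work around the first error with retain_graph=True, the optimizer's in-place parameter update invalidates tensors saved in the retained graph, so the next backward pass fails with the in-place-modification error instead.

    import numpy as np
    import torch
    from torch import optim
    import torch.distributions as D

    weights = torch.ones(8, requires_grad=True)
    means = torch.tensor(np.random.randn(8, 2), requires_grad=True)
    stdevs = torch.tensor(np.abs(np.random.randn(8, 2)), requires_grad=True)
    optimizer1 = optim.SGD([weights, means, stdevs], lr=0.001, momentum=0.9)

    # mixture built once, outside the loop -- the problematic ordering
    mix = D.Categorical(weights)
    comp = D.Independent(D.Normal(means, stdevs), 1)
    gmm = D.MixtureSameFamily(mix, comp)

    for i in range(2):
        optimizer1.zero_grad()
        loss2 = -gmm.log_prob(torch.randn(5000, 2)).mean()
        loss2.backward(retain_graph=True)  # avoids the "backward a second time" error ...
        optimizer1.step()                  # ... but this in-place update breaks the retained
                                           # graph; the second iteration raises:
                                           # RuntimeError: one of the variables needed for
                                           # gradient computation has been modified by an
                                           # inplace operation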