I'm building a GRU model for stock price forecasting, and I want to integrate a stochastic process into the model to capture price volatility.
This is the class that produces a stochastic path:
import numpy as np
import torch
import torch.nn as nn

class SDE(nn.Module):
    def __init__(self, lam, sigma):
        super().__init__()
        self.lam = lam
        self.sigma = sigma

    def forward(self, T, steps, Npaths):
        np.random.seed(4)
        lam = self.lam.detach().numpy()
        sigma = self.sigma.detach().numpy()
        .....
        return sigma * lam * xx
Now, my model is:
class MyModel(nn.Module):
    def __init__(self, args):
        super(MyModel, self).__init__()
        self.args = args
        self.input_dim = args['input_dim']
        self.hidden_dim = args['n_hidden_units']
        self.layer_dim = args['num_layers']
        self.output_dim = args['output_dim']
        self.lam = nn.Parameter(torch.tensor(1.0), requires_grad=True)
        self.sigma = nn.Parameter(torch.tensor(0.2), requires_grad=True)
        # GRU layers
        self.gru = nn.GRU(
            self.input_dim, self.hidden_dim, self.layer_dim, batch_first=True,
            dropout=args['dropout'], bidirectional=True)
        # SDE
        self.levy = SDE(self.lam, self.sigma)
        # Fully connected layer
        self.fc = nn.Linear(self.hidden_dim * 2, self.output_dim)

    def forward(self, x):
        lev = torch.from_numpy(self.levy(1.0, 16, 1))
        .....
        h0 = torch.zeros(self.args['num_layers'] * 2, x.size(0), self.args['n_hidden_units'],
                         device=x.device).requires_grad_()
        out, _ = self.gru(x, h0.detach())
        out = out[:, -1, :]
        out = self.fc(out)
        out_m = torch.mul(out, lev)
        return out_m
The training step will be something like this:
# Makes predictions
yhat = self.model(x)
# Computes loss
loss = self.loss_fn(y, yhat)
# Computes gradients
#loss.requires_grad = True
loss.backward()
# Updates parameters and zeroes gradients
self.optimizer.step()
self.optimizer.zero_grad()
By training this network, should this code calibrate and identify optimal values for the sigma and lam parameters used to generate the stochastic path in SDE?
I can see from debugging that their values never change.
Any tips, please, to make this code achieve my objective of calibrating sigma and lam?
If you want lam and sigma to be learned, you need to implement them as PyTorch Parameters and compute the results of lam and sigma using PyTorch methods. When you call .detach().numpy() on values, you remove them from the computational graph, which means PyTorch can't update them via backprop.
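
A minimal illustration of the break (the names here are just for demonstration): once a value goes through .detach().numpy(), the result is an ordinary constant with no grad_fn, so backward() has nothing to follow back to the parameter.

import torch
import torch.nn as nn

lam = nn.Parameter(torch.tensor([1.0]))

# Staying in torch keeps lam in the graph: the result has a grad_fn
good = (lam * 2.0).sum()
print(good.grad_fn)   # <SumBackward0 ...>

# Routing through numpy severs the graph: the result is a constant
bad = torch.from_numpy(lam.detach().numpy() * 2.0).sum()
print(bad.grad_fn)    # None -> backward() can't reach lam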
From your code, you define lam and sigma as Parameters in your MyModel class, so leave that be. For the SDE element, you can replace the class with a function (the SDE class is just holding variables that are already stored in MyModel). The SDE forward method isn't shown, but you need to implement it with PyTorch methods rather than NumPy.
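
As a sketch of what that could look like: the path construction below is a placeholder (a simple scaled Brownian motion, since the body of your forward is elided), but it shows the shape of the fix. Every operation is a torch op, so gradients flow from the loss back into lam and sigma:

import torch

def sde_path(lam, sigma, T, steps, npaths):
    # Placeholder dynamics -- substitute your actual path construction.
    dt = T / steps
    # The noise itself needs no gradient; only the way lam and sigma
    # scale it has to stay inside the autograd graph.
    dW = torch.randn(npaths, steps) * dt ** 0.5
    xx = torch.cumsum(dW, dim=1)
    return sigma * lam * xx   # torch ops only, so lam and sigma stay differentiable

In MyModel.forward you would then call it directly, with no from_numpy round trip:

lev = sde_path(self.lam, self.sigma, 1.0, 16, 1)

(If you need reproducible noise, torch.manual_seed plays the role of np.random.seed.)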
You should also consider what you hope to accomplish with the SDE method. It looks like SDE produces a sequence of values based on lam and sigma that are used to scale the outputs of your GRU. Your model might decide it's easier to avoid the SDE params by setting them to lam=1, sigma=0, or other uninformative parameters.
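
If that degenerate solution becomes a problem in practice, one option (illustrative, not something your code requires) is to learn unconstrained raw values and map them through softplus, so sigma in particular can shrink but never hit zero or go negative:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Learn an unconstrained raw parameter; the model only ever sees the
# transformed, strictly positive value.
raw_sigma = nn.Parameter(torch.tensor(-1.5))
sigma = F.softplus(raw_sigma)   # ~0.20, always > 0, still differentiable

Initialising raw_sigma at -1.5 recovers roughly the 0.2 starting value from your code. It's also worth printing lam.item() and sigma.item() every few epochs; once the detach is removed, you should see them drift from their initial values.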