I am trying to use stochastic gradient descent, but my error/loss is not decreasing and I can't work out why. From the train dataframe I use the index (each sequence) and the binding affinity, and the goal is to predict the binding affinity. Here is what the head of the dataframe looks like:
For training, I one-hot encode a sequence and compute a score from it using another matrix. The goal is to get this score as close as possible to the binding affinity for any given peptide. The score calculation and the training loop are shown in the code below, but I don't think an explanation of them is needed to work out why the error fails to decrease.
import statistics
import numpy as np
import torch
from torch import nn, optim
from torch.autograd import Variable as Var  # assumed alias; the original import is not shown

#ONE-HOT ENCODING
AA=['A','R','N','D','C','Q','E','G','H','I','L','K','M','F','P','S','T','W','Y','V']
loc=['N','2','3','4','5','6','7','8','9','10','11','C']
aa = "ARNDCQEGHILKMFPSTWYV"
def p_one_hot(seq):
    c2i = dict((c,i) for i,c in enumerate(aa)) #amino acid letter -> row index
    int_encoded = [c2i[char] for char in seq]
    onehot_encoded = list()
    for value in int_encoded:
        letter = [0 for _ in range(len(aa))]
        letter[value] = 1
        onehot_encoded.append(letter)
    return torch.Tensor(np.transpose(onehot_encoded)) #shape: 20 x len(seq)
#INITIALIZE TENSORS
a=Var(torch.randn(20,1),requires_grad=True) #initialize similarity vector - random array of 20 numbers
freq_m=Var(torch.randn(12,20),requires_grad=True)
freq_m.data=(freq_m.data-freq_m.min().data)/(freq_m.max().data-freq_m.min().data) #0 to 1 scaling
optimizer = optim.SGD([torch.nn.Parameter(a), torch.nn.Parameter(freq_m)], lr=1e-6)
loss = nn.MSELoss()
#TRAINING LOOP
epochs = 100
for i in range(epochs):
    #RANDOMLY SAMPLE DATA
    train = all_seq.sample(frac=.03)
    names = train.index.values.tolist()
    affinities = train['binding_affinity']
    print('Epoch: ' + str(i))
    #forward pass
    iteration_loss=[]
    for j, seq in enumerate(names):
        sm=torch.mm(a,a.t()) #make similarity matrix square symmetric
        freq_m.data=freq_m.data/freq_m.data.sum(1,keepdim=True) #sum of each row must be 1 (sum of probabilities of each amino acid at each position)
        affin_score = affinities[j]
        new_m = torch.mm(p_one_hot(seq), freq_m)
        tss_m = new_m * sm
        tss_score = tss_m.sum()
        sms = sm
        fms = freq_m
        error = loss(tss_score, torch.FloatTensor(torch.Tensor([affin_score])))
        iteration_loss.append(error.item())
        optimizer.zero_grad()
        error.backward()
        optimizer.step()
    mean = statistics.mean(iteration_loss)
    stdev = statistics.stdev(iteration_loss)
    print('Epoch Average Error: ' + str(mean) + '. Epoch Standard Deviation: ' + str(stdev))
    iteration_loss.clear()
After each epoch, I print the average of all errors for that epoch along with the standard deviation. Each epoch runs through about 45,000 sequences. However, after 10 epochs I still see no improvement in the error, and I'm unsure why. Here is the output I am seeing:
Are there any ideas as to what I'm doing wrong? I'm new to PyTorch so any help is appreciated! Thank you!
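It turns out that wrapping the tensors in torch.nn.Parameter() when passing them to the optimizer was the problem. As far as I can tell, each call creates a new Parameter object that is not the tensor used in the forward pass, so error.backward() never fills in its gradient and optimizer.step() has nothing to apply. Passing a and freq_m to the optimizer directly fixes this, and the error now decreases.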
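
To make the difference concrete, here is a minimal sketch of both setups. It is not my actual model: `w`, `score`, `target`, and the learning rate are hypothetical stand-ins chosen so the example runs on its own, and the scoring function is a simple sum rather than the similarity/frequency product above.

```python
import torch

torch.manual_seed(0)

# Leaf tensor used in the forward pass (plays the role of `a` in the question).
w = torch.randn(20, 1, requires_grad=True)
target = torch.tensor(1.0)  # hypothetical target value


def score(weights):
    # Toy stand-in for the real scoring function: just the sum of the weights.
    return weights.sum()


# Broken setup: the optimizer holds a new Parameter object, not `w` itself.
wrapped = torch.nn.Parameter(w)
broken_opt = torch.optim.SGD([wrapped], lr=1e-2)

before = w.detach().clone()
loss = (score(w) - target) ** 2
broken_opt.zero_grad()
loss.backward()      # gradient lands on `w`, not on `wrapped`
broken_opt.step()    # nothing to apply, since `wrapped.grad` was never set

wrapped_grad_is_none = wrapped.grad is None
w_unchanged = torch.equal(w.detach(), before)

# Fixed setup: hand the optimizer the very tensor used in the forward pass.
w.grad = None  # discard the gradient left over from the broken run
fixed_opt = torch.optim.SGD([w], lr=1e-2)

losses = []
for _ in range(50):
    loss = (score(w) - target) ** 2
    fixed_opt.zero_grad()
    loss.backward()
    fixed_opt.step()
    losses.append(loss.item())

print("wrapped.grad is None:", wrapped_grad_is_none)
print("w unchanged after broken step:", w_unchanged)
print("loss decreased with fixed optimizer:", losses[-1] < losses[0])
```

In the question's code, the equivalent change would be `optim.SGD([a, freq_m], lr=1e-6)`.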