I am trying to code an optimizer that finds the optimal constant parameters so as to minimize the MSE between an array y and a generic function over X. The generic function is given in pre-order, so, for example, if the function over X is x1 + c*x2, it would be represented as [+, x1, *, c, x2]. With that example, the objective would be minimizing:
sum_for_all_x (y - (x1 + c*x2))^2
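As a minimal sketch of the representation (hypothetical string tokens and plain Python numbers, instead of the index/tensor encoding used below), a prefix list like this can be evaluated by scanning it right to left with a stack:

```python
# Evaluate a pre-order (prefix) expression with a stack, scanning right to left.
# Tokens are strings here; variable and constant values come from a dict.
def eval_prefix(tokens, values):
    ops = {'+': lambda a, b: a + b, '*': lambda a, b: a * b}
    stack = []
    for tok in reversed(tokens):
        if tok in ops:
            a = stack.pop()               # first (left) operand
            b = stack.pop()               # second (right) operand
            stack.append(ops[tok](a, b))
        else:
            stack.append(values[tok])     # variable or constant value
    return stack[0]

# [+, x1, *, c, x2] with x1=2, x2=3, c=4  ->  2 + 4*3 = 14
print(eval_prefix(['+', 'x1', '*', 'c', 'x2'], {'x1': 2, 'x2': 3, 'c': 4}))
```

Note that when scanning the reversed list, the first pop is the left operand, which is the same convention the loss function below relies on.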
I show next what I have done to solve the problem. Some things that should be known are:
def loss(self, constants, X, y):
    stack = []  # Stack to save the partial results
    const = 0   # Index of the next constant to be used
    for idx in self.traversal[::-1]:  # Scan the prefix notation right to left
        if idx > Language.max_variables:  # If we are dealing with an operator
            function = Language.idx_to_token[idx]  # Get its associated function
            first_operand = stack.pop()  # Get the first operand
            if function.arity == 1:  # If the arity of the operator is one (e.g. sin)
                stack.append(function.function(first_operand))  # Append result
            else:  # Same but if arity is 2
                second_operand = stack.pop()  # Need a second operand
                stack.append(function.function(first_operand, second_operand))
        elif idx == 0:  # idx 0 indicates a constant
            stack.append(constants[const] * torch.ones(X.shape[0]))  # Append constant
            const += 1  # Move on to the next constant
        else:
            stack.append(X[:, idx - 1])  # Else append the associated column of X
    prediction = stack[0]
    return (y - prediction).pow(2).mean().cpu().numpy()
def optimize_constants(self, X, y):
    '''Optimize the constants of the expression tree.'''
    if 0 not in self.traversal:  # If there are no constants to be optimized, return
        return self.traversal
    x0 = [0 for i in range(len(self.constants))]  # Initial guess
    ini = time.time()
    res = minimize(self.loss, x0, args=(X, y), method='BFGS', options={'disp': True})
    print(res)
    print('Time:', time.time() - ini)
The problem is that the optimizer reports that it terminated successfully but does not iterate at all. The output res looks like this:
Optimization terminated successfully.
Current function value: 2.920725
Iterations: 0
Function evaluations: 2
Gradient evaluations: 1
fun: 2.9207253456115723
hess_inv: array([[1]])
jac: array([0.])
message: 'Optimization terminated successfully.'
nfev: 2
nit: 0
njev: 1
status: 0
success: True
x: array([0.])
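For what it's worth, this nit = 0 / jac = array([0.]) pattern can be reproduced without PyTorch at all, with a toy objective cast to float32 (standing in for what tensor.cpu().numpy() returns). The difference BFGS probes with its default finite-difference step (~1.5e-8) is smaller than one float32 ulp of the loss value, so the estimated gradient is exactly zero and the optimizer stops immediately; a coarser step via the eps option is one workaround (the objective and the constant 2.5 below are made up):

```python
import numpy as np
from scipy.optimize import minimize

def loss32(c):
    # A quadratic with minimum at c = 2.5, rounded to float32 on return,
    # mimicking a torch-backed loss; the +100 offset makes the rounding bite.
    return np.float32((c[0] - 2.5) ** 2 + 100.0)

stuck = minimize(loss32, [0.0], method='BFGS')
fixed = minimize(loss32, [0.0], method='BFGS', options={'eps': 1e-3})
print(stuck.nit, stuck.jac)  # 0 iterations, zero Jacobian, like above
print(fixed.nit, fixed.x)    # actually iterates toward c = 2.5
```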
What I found so far: it seems that scipy.optimize.minimize does not work well with PyTorch here. The loss returns a float32 scalar, so the change BFGS sees over its default finite-difference step (~1.5e-8) is lost to float32 rounding, the Jacobian comes out as exactly zero, and the optimizer declares success after zero iterations. Changing the code to use NumPy ndarrays (float64) solved the problem.
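A minimal sketch of that fix for the example expression [+, x1, *, c, x2], evaluated directly in float64 NumPy (the synthetic data and the true constant 2.5 are made up for illustration):

```python
import numpy as np
from scipy.optimize import minimize

def loss(constants, X, y):
    # y_hat = x1 + c * x2, i.e. the example traversal [+, x1, *, c, x2]
    prediction = X[:, 0] + constants[0] * X[:, 1]
    return ((y - prediction) ** 2).mean()   # float64 scalar, full precision

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X[:, 0] + 2.5 * X[:, 1]                 # true constant c = 2.5, no noise

res = minimize(loss, x0=[0.0], args=(X, y), method='BFGS')
print(res.x, res.nit)  # recovers c close to 2.5, with nit > 0 this time
```

In float64, the finite-difference perturbation survives the rounding, so the gradient is nonzero and BFGS actually iterates.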