I'm trying to make a kernel estimator (as given in the article) predict values for a vector of inputs, rather than a single value. Here is the code:
from scipy.stats import norm
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import math

class GKR:
    def __init__(self, x, y, b):
        self.x = x
        self.y = y
        self.b = b

    '''Implement the Gaussian Kernel'''
    def gaussian_kernel(self, z):
        return (1/math.sqrt(2*math.pi))*math.exp(-0.5*z**2)

    '''Calculate weights and return prediction'''
    def predict(self, X):
        kernels = [self.gaussian_kernel((xi-X)/self.b) for xi in self.x]
        weights = [len(self.x) * (kernel/np.sum(kernels)) for kernel in kernels]
        return np.dot(weights, self.y)/len(self.x)

llke = GKR(x, y, 0.2)
y_hat = pd.DataFrame(index=range(x.shape[0]))
for row in range(x.shape[0]):
    y_hat.iloc[row] = llke.predict(float(x.iloc[row]))
y_hat
but I get the error: TypeError: unsupported operand type(s) for -: 'str' and 'float'. From what I understand, the issue is in self.gaussian_kernel((xi-X)/self.b): for some reason it sees xi as a string. I tried wrapping it in float(), but that doesn't help. What could be the problem? Alternatively, feel free to suggest other ways to apply predict to a vector rather than a single value.
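One way to apply the estimator to a whole vector without the per-row loop is to vectorize predict with NumPy broadcasting. A minimal sketch (the class name GKRVectorized is made up; it assumes x and y can be coerced to flat float arrays):

```python
import numpy as np

class GKRVectorized:
    """Nadaraya-Watson kernel regression, predicting for a whole query vector at once."""
    def __init__(self, x, y, b):
        self.x = np.asarray(x, dtype=float)
        self.y = np.asarray(y, dtype=float)
        self.b = b

    def predict(self, X):
        X = np.atleast_1d(np.asarray(X, dtype=float))
        # shape (len(X), len(self.x)): one row of kernel evaluations per query point
        z = (self.x[None, :] - X[:, None]) / self.b
        k = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
        # kernel-weighted average of y for each query point
        return (k @ self.y) / k.sum(axis=1)

model = GKRVectorized([0.0, 1.0, 2.0], [0.0, 1.0, 2.0], 0.5)
y_hat = model.predict([0.0, 1.0, 2.0])  # one prediction per query value
```

Casting to float in the constructor also sidesteps the TypeError, since the subtraction then always operates on numeric arrays.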
The issue was that x and y were given in the wrong format (as lists of lists, rather than flat lists), which is why the interpreter couldn't deal with them properly: iterating over a list of lists yields the inner lists, not the numeric values, so xi-X fails.
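For illustration (the data here is made up, but shaped as described: lists of lists whose elements are strings), flattening and casting to float before constructing the estimator avoids the error:

```python
import pandas as pd

# data in the problematic format: lists of one-element lists of strings
x_raw = [['1.0'], ['2.0'], ['3.0'], ['4.0']]
y_raw = [['2.1'], ['3.9'], ['6.2'], ['8.1']]

# flatten the inner lists and cast each string to a float
x = pd.Series([float(v[0]) for v in x_raw])
y = pd.Series([float(v[0]) for v in y_raw])

# now xi - X inside gaussian_kernel operates on floats, so no TypeError
print(x.dtype)
```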