python vectorization typeerror kernel-density

TypeError: unsupported operand type(s) for -: 'str' and 'float' when trying to loop over a function in Python


I'm trying to make a Kernel Estimator (as given in The Article) predict values for a whole vector, rather than for a single value. Here is the code:

from scipy.stats import norm
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
import math

class GKR:
    def __init__(self, x, y, b):
        self.x = x
        self.y = y
        self.b = b

    def gaussian_kernel(self, z):
        '''Implement the Gaussian Kernel'''
        return (1/math.sqrt(2*math.pi))*math.exp(-0.5*z**2)

    def predict(self, X):
        '''Calculate weights and return prediction'''
        kernels = [self.gaussian_kernel((xi-X)/self.b) for xi in self.x]
        weights = [len(self.x) * (kernel/np.sum(kernels)) for kernel in kernels]
        return np.dot(weights, self.y)/len(self.x)


llke = GKR(x, y, 0.2)

y_hat=pd.DataFrame(index=range(x.shape[0]))
for row in range(x.shape[0]):
    y_hat.iloc[row] = llke.predict(float(x.iloc[row]))
y_hat

But I get the error: TypeError: unsupported operand type(s) for -: 'str' and 'float'. As far as I understand, the problem is in self.gaussian_kernel((xi-X)/self.b), where xi is treated as a string for some reason. I tried wrapping it in float(), but that doesn't help. What could be the problem? Alternatively, feel free to suggest other ways to apply predict to a vector rather than to a single value.


Solution

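  • The issue was that x and y were given in the wrong format (as lists of lists, rather than flat lists), which is why the interpreter could not deal with them properly. Iterating over such a two-dimensional input does not yield individual numbers, so xi - X fails.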
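
To illustrate the point above, here is a minimal sketch, not the original poster's final code. It assumes x and y arrive as one-column pandas DataFrames (the question's use of x.iloc and x.shape suggests this, but it is a guess); the sample values and the names df_x, df_y and model are made up for the example. Iterating over a DataFrame yields its column labels, which would explain the 'str' in the error message. The sketch flattens both inputs to 1-D float arrays and lets predict take a scalar or a whole vector via NumPy broadcasting, computing the same weighted average as the original predict.

```python
import numpy as np
import pandas as pd


class GKR:
    """Gaussian kernel regression, vectorized with NumPy (illustrative sketch)."""

    def __init__(self, x, y, b):
        # Flatten to 1-D float arrays so that a one-column DataFrame or a
        # list of lists behaves like a flat list of numbers.
        self.x = np.asarray(x, dtype=float).ravel()
        self.y = np.asarray(y, dtype=float).ravel()
        self.b = b

    def gaussian_kernel(self, z):
        # np.exp works element-wise on arrays, unlike math.exp.
        return np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)

    def predict(self, X):
        # Accept a scalar or a vector of query points.
        X = np.atleast_1d(np.asarray(X, dtype=float)).ravel()
        # kernels[i, j] = K((x_j - X_i) / b), built by broadcasting.
        kernels = self.gaussian_kernel((self.x[None, :] - X[:, None]) / self.b)
        # Normalize each row so the weights for one query point sum to 1.
        weights = kernels / kernels.sum(axis=1, keepdims=True)
        return weights @ self.y


# Hypothetical sample data, stored the way the question seems to store it.
df_x = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]})
df_y = pd.DataFrame({"y": [2.0, 4.0, 6.0, 8.0]})

# Iterating over a DataFrame gives column labels, not values.
print(list(df_x))  # ['x']

model = GKR(df_x, df_y, 0.2)
y_hat = model.predict(df_x)  # one prediction per row, no explicit loop
print(y_hat)
```

With this version the row-by-row loop that fills y_hat is no longer needed: model.predict(df_x) returns one prediction per query point, and model.predict(2.0) returns a length-1 array for a single value.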