I am trying to create a surface plot of the error function in linear regression. I do it like this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


class LinearRegression:
    def __init__(self):
        self.data = pd.read_csv("data.csv")

    def computeCost(self):
        j = 0.5 * (
            (self.data.hypothesis - self.data.y)**2).sum() / self.data.y.size
        return j

    def regress(self, theta0, theta1):
        self.data["hypothesis"] = theta1 * self.data.x + theta0

    def plotCostFunction3D(self):
        fig = plt.figure()
        ax = fig.add_subplot(111, projection='3d')
        theta0_vals = np.linspace(-100, 100, 200)
        theta1_vals = np.linspace(-100, 100, 200)
        costs = []
        for theta0 in theta0_vals:
            for theta1 in theta1_vals:
                self.regress(theta0, theta1)
                costs.append(self.computeCost())
        ax.plot_surface(
            theta0_vals,
            theta1_vals,
            np.array(costs),
        )


if __name__ == "__main__":
    regression = LinearRegression()
    regression.plotCostFunction3D()
    plt.show()

I get the following error:
ValueError: Argument Z must be 2-dimensional.
I am aware that I need to use np.meshgrid for theta0_vals and theta1_vals, but I'm not sure how to compute the costs from those results. How would I go about it?
The error is caused by the method call ax.plot_surface(theta0_vals, theta1_vals, np.array(costs)), because Axes3D.plot_surface(X, Y, Z) expects its arguments to be two-dimensional arrays.

So, as you note, np.meshgrid() should be used to compute the grid spanned by theta0_vals and theta1_vals. As for Z, you have already computed the cost at every point of the grid in the nested for loops, so you just need to turn the one-dimensional costs list into a two-dimensional array that matches the X-Y grid. This can be done with np.reshape(). Because the outer loop runs over theta0_vals, the reshaped array has theta0 along its first axis, so call np.meshgrid() with indexing='ij' to get X and Y in the same layout. The default, indexing='xy', would produce the transposed layout and the surface would be plotted with the two axes swapped.

def plotCostFunction3D(self):
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    theta0_vals = np.linspace(-100, 100, 200)
    theta1_vals = np.linspace(-100, 100, 200)
    costs = []
    for theta0 in theta0_vals:
        for theta1 in theta1_vals:
            self.regress(theta0, theta1)
            costs.append(self.computeCost())
    # indexing='ij' gives X[i, j] = theta0_vals[i] and Y[i, j] = theta1_vals[j],
    # which matches the loop order used to fill costs.
    X, Y = np.meshgrid(theta0_vals, theta1_vals, indexing='ij')
    Z = np.reshape(costs, (len(theta0_vals), len(theta1_vals)))
    ax.plot_surface(X, Y, Z)
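
To check how the loop order, np.reshape() and the np.meshgrid() layout relate to each other, here is a small sketch with grids of different lengths. The arrays a_vals and b_vals and the toy cost a + b are made up for illustration; they only stand in for theta0_vals, theta1_vals and computeCost().

```python
import numpy as np

a_vals = np.array([0.0, 1.0, 2.0])   # plays the role of theta0_vals
b_vals = np.array([10.0, 20.0])      # plays the role of theta1_vals

# Same loop order as in plotCostFunction3D: outer loop over a, inner loop over b.
costs = [a + b for a in a_vals for b in b_vals]

# Row i corresponds to a_vals[i], column j to b_vals[j].
Z = np.reshape(costs, (len(a_vals), len(b_vals)))

A_ij, B_ij = np.meshgrid(a_vals, b_vals, indexing="ij")  # shape (3, 2)
A_xy, B_xy = np.meshgrid(a_vals, b_vals)                 # default "xy", shape (2, 3)

print(np.array_equal(Z, A_ij + B_ij))  # True: Z lines up with the "ij" grid
print(Z.shape, A_xy.shape)             # (3, 2) (2, 3)
```

With equal-length grids, as in the 200 by 200 case, the shapes agree under either indexing, so a mismatched layout would not raise an error; it would only show up as a surface with the axes swapped.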
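For better performance, it would be nice to avoid the nested for loops. You could store the X-Y grid points in a dataframe and then compute the Z column with df.apply(). Keep in mind that a row-wise df.apply() still runs a Python-level loop, so a fully vectorized NumPy computation is likely to be faster still.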
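
As one possible way to drop the loops entirely, the cost can be computed for the whole grid at once with NumPy broadcasting. This is only a sketch under some assumptions: cost_grid is a hypothetical helper, not part of the original class, and the small x and y arrays are made-up sample data standing in for the columns of data.csv.

```python
import numpy as np


def cost_grid(x, y, theta0_vals, theta1_vals):
    """Return Z with Z[i, j] = cost at (theta0_vals[i], theta1_vals[j])."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    t0 = np.asarray(theta0_vals, dtype=float)[:, None, None]  # shape (n0, 1, 1)
    t1 = np.asarray(theta1_vals, dtype=float)[None, :, None]  # shape (1, n1, 1)
    residuals = t0 + t1 * x - y  # broadcasts to shape (n0, n1, m)
    # Same formula as computeCost: 0.5 * sum of squared errors / number of samples.
    return 0.5 * (residuals ** 2).mean(axis=2)


# Made-up sample data (assumption): points on the line y = 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

theta0_vals = np.linspace(-100, 100, 200)
theta1_vals = np.linspace(-100, 100, 200)
X, Y = np.meshgrid(theta0_vals, theta1_vals, indexing="ij")
Z = cost_grid(x, y, theta0_vals, theta1_vals)
```

X, Y and Z can then be passed to ax.plot_surface(X, Y, Z) as before. The intermediate residuals array holds one value per grid point per data row, so for a large dataset it may be worth processing the grid in chunks.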