Search code examples
pythonmatplotliblinear-regressionheatmapsurface

Fitted surface does not resemble the heatmap produced from the same data


I have some z values spanning over a grid of, say, 8 x and 10 z values. The z values can be depicted as a heatmap over the x-y plane (left subplot). Now I want to fit a surface to the same data that are shown in the heatmap. However, the fitted surface (right subplot) does not at all look like the heatmap. What am I doing wrong here? Also, if I remove the .T from the contour1 = ... line (as some have suggested), I get a type error because the shapes do not match.

Here's the code:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x_values = np.linspace(1, 8, 8)
y_values = np.linspace(1, 10, 10)
z_values = [[0.0128, 0.0029, 0.0009, 0.0006, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
            [0.0049, 0.0157, 0.0067, 0.0003, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
            [0.0203, 0.0096, 0.0055, 0.0096, 0.0012, 0.0023, 0.0000, 0.0000, 0.0000, 0.0000],
            [0.0229, 0.0191, 0.0020, 0.0073, 0.0055, 0.0026, 0.0022, 0.0000, 0.0000, 0.0000],
            [0.0218, 0.0357, 0.0035, 0.0133, 0.0073, 0.0145, 0.0000, 0.0029, 0.0000, 0.0000],
            [0.0261, 0.0232, 0.0365, 0.0200, 0.0212, 0.0107, 0.0036, 0.0007, 0.0022, 0.0007],
            [0.0305, 0.0244, 0.0284, 0.0786, 0.0226, 0.0160, 0.0000, 0.0196, 0.0007, 0.0007],
            [0.0171, 0.0189, 0.0598, 0.0215, 0.0218, 0.0464, 0.0399, 0.0051, 0.0000, 0.0000]]
z_values = np.array(z_values)

x_grid, y_grid = np.meshgrid(x_values, y_values)

fig = plt.figure(figsize=(12, 5))

ax1 = fig.add_subplot(121)
contour1 = ax1.contourf(x_grid, y_grid, np.log(z_values.T + 1))
fig.colorbar(contour1, ax=ax1)
ax1.set_xlabel('x values')
ax1.set_ylabel('y values')

x_flat = x_grid.flatten()
y_flat = y_grid.flatten()
z_flat = z_values.flatten()

degree = 4
poly_features = PolynomialFeatures(degree=degree)
X_poly = poly_features.fit_transform(np.column_stack((x_flat, y_flat)))

model = LinearRegression()
model.fit(X_poly, z_flat)

z_pred = model.predict(X_poly)
z_pred_grid = z_pred.reshape(x_grid.shape)

ax2 = fig.add_subplot(122, projection='3d')
ax2.plot_surface(x_grid, y_grid, np.log(z_pred_grid + 1), cmap='viridis')
ax2.set_xlabel('x values')
ax2.set_ylabel('y values')
ax2.set_zlabel('z values')

plt.show()

And here's the figure:

enter image description here


Solution

  • EDITED FOLLOWING COMMENTS

    If you use the default numpy.meshgrid call then the layout of your x and y values with respect to the dependent-variable array corresponds to the following: (print x_grid, y_grid and z_values out if you want to see):

        x ------->
     y
     |   [dependent-]
     |   [variable  ]
    \|/  [ array    ]
    

    Your poly-fitted figure corresponds to this, but your heat-map doesn't. I think that's because you transposed z in the heatmap plot but you didn't transpose z in the polyfit.

    Whatever you do, you need to use the SAME dependent-variable array in both figures.

    You have TWO choices: either (1) transpose the dependent-variable array before any plotting and leave the meshgrid call as it is; or (2) leave the dependent-variable array as it is but change the indexing order in meshgrid.

    Full code below. Please note the 2 options (given by easily-changed variable option).

    The output figure is the same in both cases. The peaks of the diagrams now correspond between the two subfigures. They are at high-x/low-y.

    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression
    
    x_values = np.linspace(1, 8, 8)  
    y_values = np.linspace(1, 10, 10)
    
    z_values = [[0.0128, 0.0029, 0.0009, 0.0006, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
                [0.0049, 0.0157, 0.0067, 0.0003, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
                [0.0203, 0.0096, 0.0055, 0.0096, 0.0012, 0.0023, 0.0000, 0.0000, 0.0000, 0.0000],
                [0.0229, 0.0191, 0.0020, 0.0073, 0.0055, 0.0026, 0.0022, 0.0000, 0.0000, 0.0000],
                [0.0218, 0.0357, 0.0035, 0.0133, 0.0073, 0.0145, 0.0000, 0.0029, 0.0000, 0.0000],
                [0.0261, 0.0232, 0.0365, 0.0200, 0.0212, 0.0107, 0.0036, 0.0007, 0.0022, 0.0007],
                [0.0305, 0.0244, 0.0284, 0.0786, 0.0226, 0.0160, 0.0000, 0.0196, 0.0007, 0.0007],
                [0.0171, 0.0189, 0.0598, 0.0215, 0.0218, 0.0464, 0.0399, 0.0051, 0.0000, 0.0000]]
    
    #### If you want x to vary with ROW rather than the default COLUMN then choose ONE of these two approaches
    option = 2
    if ( option == 1 ):        # Default meshgrid, but transpose the dependent variables
       z_values = np.array(z_values).T
       x_grid, y_grid = np.meshgrid(x_values, y_values)
    else:                      # Leave the dependent variable, but use non-default meshgrid arrangement: indexing='ij'
       z_values = np.array(z_values)
       x_grid, y_grid = np.meshgrid(x_values, y_values,indexing='ij')
    
    
    fig = plt.figure(figsize=(12, 5))
    ax1 = fig.add_subplot(121)
    contour1 = ax1.contourf(x_grid, y_grid, np.log(z_values + 1))
    fig.colorbar(contour1, ax=ax1)
    ax1.set_xlabel('x values')
    ax1.set_ylabel('y values')
    
    x_flat = x_grid.flatten()
    y_flat = y_grid.flatten()
    z_flat = z_values.flatten()
    
    degree = 4
    poly_features = PolynomialFeatures(degree=degree)
    X_poly = poly_features.fit_transform(np.column_stack((x_flat, y_flat)))
    
    model = LinearRegression()
    model.fit(X_poly, z_flat)
    
    z_pred = model.predict(X_poly)
    z_pred_grid = z_pred.reshape(x_grid.shape)
    
    ax2 = fig.add_subplot(122, projection='3d')
    ax2.plot_surface(x_grid, y_grid, np.log(z_pred_grid + 1), cmap='viridis')
    ax2.set_xlabel('x values')
    ax2.set_ylabel('y values')
    ax2.set_zlabel('z values')
    
    plt.show()
    

    Output: enter image description here