python numpy matplotlib linear-regression

Line chart of two NumPy arrays in matplotlib


import matplotlib.pyplot as plt
plt.plot(x_test,y_pred)
print(x_test)
print()
print(y_pred)
print()
print(y_test)
plt.plot(x_test,y_test)
plt.show()

I'm trying to plot a line chart using the three arrays above. All the arrays have the same shape, but for some reason the line chart comes out messy. I'm still new to this; could anyone please tell me what I'm doing wrong?

Output:

 [31.588117   53.66093226 46.68222865 43.10782022 70.34607562 44.49285588
 57.5045333  36.93007661 55.80573336 38.95476907 56.9012147  56.86890066
 34.3331247  59.04974121 57.78822399 54.28232871 51.0887199  50.28283635
 44.21174175 38.00548801 32.94047994 53.69163957 68.76573427 46.2309665
 68.31936082 50.03017434 49.23976534 50.03957594 48.14985889 25.12848465]

[ 48.07285783  77.96480957  68.51395026  63.67333329 100.56054527
  65.5490079   83.16997871  55.30716832  80.86939107  58.04909356
  82.35293882  82.30917777  51.79026483  85.26256544  83.55416479
  78.80633125  74.48140943  73.39004741  65.1683111   56.76353649
  49.90428565  78.00639472  98.42037934  67.90283173  97.8158813
  73.0478817   71.97747598  73.06061375  70.50147799  39.32494706]

[ 50.39267014  63.64239878  72.24725107  57.81251298 104.25710159
  86.64202032  91.486778    55.23166089  79.55043668  44.84712424
  80.20752314  83.14274979  55.72348926  77.63418251  99.05141484
  79.12064627  69.58889785  69.51050331  73.68756432  61.36690454
  67.17065577  85.66820315 114.85387123  90.12357207  97.91982104
  81.53699078  72.11183247  85.23200734  66.22495789  53.45439421]



Solution

  • Your x values are not sorted, so the line comes out messy: matplotlib plots exactly what you tell it to plot, connecting the points in the order they appear in the arrays. If you want a clean line graph, you'll need to sort your data by x first, making sure the other arrays are reordered in the same way. You can easily do that with pandas, for example.

    import pandas as pd
    df = pd.DataFrame({'x_test': x_test,
                       'y_test': y_test,
                       'y_pred': y_pred})
    
    df.sort_values('x_test', inplace=True)
    
    plt.plot(df.x_test, df.y_pred)
    plt.plot(df.x_test, df.y_test)
    plt.show()
    

    lineplot result

    Alternatively, you're probably better off with a scatter plot for the observed values, keeping the predictions as a line:

    plt.figure()
    plt.plot(df.x_test, df.y_pred)
    plt.scatter(df.x_test, df.y_test)
    plt.show()
    

    scatterplot result
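
If you would rather not pull in pandas just for the sort, the same reordering can probably be done with NumPy alone. The sketch below is only an illustration: `sort_by_x` is a hypothetical helper name, not part of NumPy or matplotlib, and the sample data is just the first four values of each array printed in the question. It assumes the arrays are one-dimensional (or can be flattened) and all have the same length.

```python
import numpy as np


def sort_by_x(x, *ys):
    """Sort x ascending and reorder every y array with the same permutation.

    Hypothetical helper for illustration; assumes all inputs have the
    same number of elements.
    """
    x = np.asarray(x).ravel()
    # argsort returns the indices that would sort x
    order = np.argsort(x, kind="stable")
    return (x[order], *(np.asarray(y).ravel()[order] for y in ys))


# First four values of each array from the question's output
x_test = np.array([31.588117, 53.66093226, 46.68222865, 43.10782022])
y_pred = np.array([48.07285783, 77.96480957, 68.51395026, 63.67333329])
y_test = np.array([50.39267014, 63.64239878, 72.24725107, 57.81251298])

x_sorted, y_pred_sorted, y_test_sorted = sort_by_x(x_test, y_pred, y_test)
```

The sorted arrays can then be passed to `plt.plot(x_sorted, y_pred_sorted)` and `plt.scatter(x_sorted, y_test_sorted)` exactly as in the pandas version. Applying one index array to all three inputs is what keeps each y value paired with its original x value.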