I have a data frame df with columns A and Q. I am using this code to fit a power-law line and draw it over the data.
import numpy as np
import matplotlib.pyplot as plt

#Actual line of equation, which has to be plotted: Q=alpha*A^beta : ln(Q)=a+b*ln(A) : y = a+b(x)
x = np.log(df['A'])
y = np.log(df['Q'])
#deriving b,a
b,a = np.polyfit(x, y, 1)  # x is already ln(A)
#deriving alpha and beta. By using a = ln(alpha); b = beta
alpha = np.exp(a)
beta = b
Q = df['Q'].values
A = df['A'].values
#equation of line
q = alpha * np.power(A,beta)
#plotting the points and line
plt.scatter(A,Q)
plt.plot(A,q, '-r')
plt.yscale('log')
plt.xscale('log')
This gives the following output, which is similar to a regression line.
But I would like to plot the same equation as upper and lower boundary curves that pass through the farthest points on either side (measured perpendicular to the green line), with the same slope as the continuous green line, as shown below.
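The idea is to first find the index of the point that lies farthest below the line, and likewise the one farthest above it. On the log scale this offset is the ratio Q / np.power(A, beta), so the two points are where that ratio is minimal and maximal. With the minimal point, alpha_min can be calculated such that Q[pos_min] == alpha_min * np.power(A[pos_min], beta), thus alpha_min = Q[pos_min] / np.power(A[pos_min], beta). The same holds for alpha_max with pos_max.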
As such lines can extend quite far away from the original points, it can help to restore the x and y limits (thus clipping the plot to the original region).
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame()
df['A'] = 10 ** np.random.uniform(0, 1, 1000) ** 2
df['Q'] = 10 ** np.random.uniform(0, 1, 1000) ** 2
x = np.log(df['A'])
y = np.log(df['Q'])
# deriving b,a
b, a = np.polyfit(x, y, 1)  # x is already ln(A)
# deriving alpha and beta. By using a = ln(alpha); b = beta
alpha = np.exp(a)
beta = b
Q = df['Q'].values
A = df['A'].values
# plotting the points and line
plt.yscale('log')
plt.xscale('log')
plt.scatter(A, Q, color='b')
# equation of line
xmin, xmax = plt.xlim() # the limits of the x-axis for drawing the line
x = np.linspace(xmin, xmax, 50)
q = alpha * np.power(x, beta)
plt.plot(x, q, '-r')
ymin, ymax = plt.ylim() # store the limits of the scatter and line plot so they can be restored later
pos_min = np.argmin(Q / np.power(A, beta))
pos_max = np.argmax(Q / np.power(A, beta))
alpha_min = Q[pos_min] / np.power(A[pos_min], beta)
alpha_max = Q[pos_max] / np.power(A[pos_max], beta)
# plt.scatter(A[pos_min], Q[pos_min], s=100, fc='none', ec='r', lw=3)
# plt.scatter(A[pos_max], Q[pos_max], s=100, fc='none', ec='g', lw=3)
plt.plot(x, (alpha_max) * np.power(x, beta), '--r')
plt.plot(x, (alpha_min) * np.power(x, beta), '--r')
plt.xlim(xmin, xmax) # restore the limits of the scatter plot
plt.ylim(ymin, ymax)
plt.show()
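The envelope step can also be separated from the plotting, which makes it easier to check. The following is a minimal sketch, not part of the original code: the helper name power_law_envelope and the synthetic data are hypothetical, and beta is assumed to be the slope of the log-log fit, as ln(Q) = ln(alpha) + beta*ln(A) implies.

```python
import numpy as np


def power_law_envelope(A, Q, beta):
    """Return (alpha_min, alpha_max) so that every point satisfies
    alpha_min * A**beta <= Q <= alpha_max * A**beta.

    Hypothetical helper for illustration; the name is not from the original code.
    """
    A = np.asarray(A, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # On a log-log plot, the vertical offset of a point from a line of
    # slope beta is ln(Q) - beta*ln(A), i.e. the log of this ratio.
    ratios = Q / np.power(A, beta)
    return ratios.min(), ratios.max()


# Synthetic data (an assumption, only for demonstration)
rng = np.random.default_rng(0)
A = 10 ** rng.uniform(0, 1, 200)
Q = 2.0 * A ** 1.5 * 10 ** rng.uniform(-0.2, 0.2, 200)

# Fit in log-log space: the slope is beta, the intercept is ln(alpha)
beta, a = np.polyfit(np.log(A), np.log(Q), 1)
alpha = np.exp(a)

alpha_min, alpha_max = power_law_envelope(A, Q, beta)
lower = alpha_min * np.power(A, beta)  # lower boundary evaluated at the data
upper = alpha_max * np.power(A, beta)  # upper boundary evaluated at the data
```

Because all three curves share the exponent beta, they are parallel straight lines on log-log axes, so the point with the largest vertical offset is also the one with the largest perpendicular distance from the fitted line.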