I am trying to write a code to plot Weibull probability. I want to find the scale parameter which is the x-intercept where the fitted line intersects with probability = 63.2%. Also the slope is very high which might be unusual. Is it a problem with the dataset or I am plotting the whole Weibull probability distribution wrong?
import numpy as np
import matplotlib.pyplot as plt
# Define the parameters
n = 8 # Sample size
i = np.arange(1, n + 1) # Rank of observations
E = np.array([208.9, 209.0, 209.2, 209.3, 209.6, 209.8, 209.9, 210]) # Electric breakdown strength
# Calculate the cumulative breakdown efficiency by the given formula
p = (i - 0.5) / (n + 0.25)
# variables
x_data = np.log(E)
y_data = np.log(-np.log(1 - p))
# Fit a linear regression to estimate the parameters of the Weibull distribution
slope, intercept = np.polyfit(x_data, y_data, 1)
# Calculate the shape parameter
shape = slope
# scale parameter is the x-intercept where the fitted line intersects with p = 63.2%. How to get that?
# Generate points for the Weibull distribution
x = np.linspace(min(x_data), max(y_data), 100)
y = slope * x + intercept
# Plot the Weibull distribution
plt.plot(x_data, y_data, 'o')
plt.plot(x, y, label='Linear fit')
plt.xlabel('ln (E)')
plt.ylabel('Probability of failure :ln(-ln(1 - p))')
plt.title('Weibull Plot')
plt.legend()
plt.show()
print("Shape parameter:", slope)
# print("Scale parameter:", ?)
The CDF of the Weibull distribution is F=1-exp(-(x/L)**k)
where, k
="shape" and L
="scale" in your notation.
Hence, log[-log(1-F)]=k.log(x)-k.log(L)
, so, if you plotted the LHS against log(x)
you would have slope k
and y-intercept -k.log(L)
. Thus, you can get the scale parameter L
from exp(-intercept/k)
.
I've put this in the code below and included it in the plot as orange crosses.
However, the Weibull distribution is not a sensible fit to your data as it stands - it makes more sense for x values from 0 to a few times scale
. The only way I could make sense of fitting a Weibull distribution here was to apply it to E-208.8
(the offset is a guess, based on your data and the need to keep it positive).
Code and output below:
import numpy as np
import matplotlib.pyplot as plt
# Define the parameters
E = np.array([208.9, 209.0, 209.2, 209.3, 209.6, 209.8, 209.9, 210]) # Electric breakdown strength
E0 = 208.8 # bit of a guess!
E = E - E0
n = len( E ) # Sample size
i = np.arange(1, n + 1) # Rank of observations
# Calculate the cumulative breakdown efficiency by the given formula
p = (i - 0.5) / (n + 0.25)
# variables
x_data = np.log(E)
y_data = np.log(-np.log(1 - p))
# Fit a linear regression to estimate the parameters of the Weibull distribution
slope, intercept = np.polyfit(x_data, y_data, 1)
# Calculate the shape parameter
shape = slope
scale = np.exp( -intercept / shape ) # -k ln( lambda ) = intercept
print("Shape parameter:", slope)
print("Scale parameter:", scale)
# Represent the (scaled) fitted Weibull distribution
Weibull_CDF = 1.0 - np.exp( -( E / scale ) ** shape )
z = np.log( -np.log( 1.0 - Weibull_CDF ) )
# Generate points for the Weibull distribution
x = np.linspace(min(x_data), max(y_data), 100)
y = slope * x + intercept
# Plot
plt.plot(x_data, y_data, 'o')
plt.plot(x_data, z, 'x', label="Weibull" )
plt.plot(x, y, label='Linear fit')
plt.xlabel('ln (E-E0)')
plt.ylabel('Probability of failure :ln(-ln(1 - p))')
plt.title('Weibull Plot')
plt.legend()
plt.show()
Values:
Shape parameter: 1.3061117692466773
Scale parameter: 0.8009805959686129