I would like to run a LOWESS fit where different data points have different weights, but I don't see how to pass weights to the lowess function. Here's some example code using lowess without weights.
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
# Create the data
x = np.random.uniform(low=-2*np.pi, high=2*np.pi, size=500)
y = np.sin(x) + np.random.normal(size=len(x))
# Apply LOWESS (Locally Weighted Scatterplot Smoothing)
lowess = sm.nonparametric.lowess
z = lowess(y, x)
w = lowess(y, x, frac=1./3)
# Plotting
plt.figure(figsize=(12, 6))
plt.scatter(x, y, label='Data', alpha=0.5)
plt.plot(z[:, 0], z[:, 1], label='LOWESS', color='red')
plt.legend()
plt.show()
My points vary in significance, so I would like to be able to create weights like weights = np.random.randint(1, 5, size=500)
and have the lowess process use them. I believe this is possible in R, but I'm not sure whether it can be done in Python. Is there a way?
First install the scikit-misc package (imported as skmisc),
which can perform weighted LOESS:
python3 -m pip install scikit-misc --user
Then for a synthetic dataset:
import numpy as np
from skmisc.loess import loess
import matplotlib.pyplot as plt
np.random.seed(12345)
x = np.sort(np.random.uniform(low=-2*np.pi, high=2*np.pi, size=500))
y = np.sin(x)
s = np.abs(0.2 * np.random.normal(size=x.size) + 0.01)
n = s * np.random.normal(size=x.size)
yn = y + n
w = 1 / s ** 2
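The choice w = 1 / s ** 2 above is inverse-variance weighting: points with a smaller noise standard deviation get a larger say in the fit. As a small aside, here is a sketch of why that is a sensible default, comparing the theoretical variance of a plain mean with that of an inverse-variance weighted mean. The names sigma, var_unweighted and var_weighted are made up for this illustration, and the noise levels are assumed to be known:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed known per-point noise standard deviations (illustrative values).
sigma = rng.uniform(0.05, 0.5, size=200)

# Inverse-variance weights, same recipe as w = 1 / s ** 2.
w = 1.0 / sigma**2

# Variance of the plain mean of independent measurements of one quantity.
var_unweighted = np.sum(sigma**2) / sigma.size**2

# Variance of the inverse-variance weighted mean of the same measurements.
var_weighted = 1.0 / np.sum(w)

print(f"unweighted: {var_unweighted:.3e}, weighted: {var_weighted:.3e}")
```

The weighted variance is never larger than the unweighted one, which is the motivation for passing such weights to the smoother.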
We create the LOESS object and feed it with data and weights:
regressor = loess(x, yn, weights=w, span=0.3)  # fit the noisy observations
regressor.fit()
We evaluate the fitted curve at the sample points:
prediction = regressor.predict(x)
And display the result:
fig, axe = plt.subplots()
axe.scatter(x, yn)
axe.plot(x, prediction.values, color="orange")
axe.grid()
Note that the API of this package is a bit different from the sklearn
API. There is another example of usage here.
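If installing an extra package is not an option, the idea behind a weighted fit can be sketched with NumPy alone. The block below is a minimal illustration, not the skmisc implementation: weighted_loess is a hypothetical helper name, it uses local linear fits with a tricube kernel, and it assumes that the user weights simply multiply the kernel weights. It also skips the robustness iterations and interpolation that full LOESS implementations offer, and it loops over every point, so it is only meant for small data sets.

```python
import numpy as np

def weighted_loess(x, y, weights, span=0.3):
    """Hypothetical helper: weighted local linear smoother evaluated at each x[i]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n = x.size
    # Number of nearest neighbours used in each local fit.
    k = min(n, max(2, int(np.ceil(span * n))))
    fitted = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        idx = np.argsort(dist)[:k]          # k nearest neighbours of x[i]
        h = dist[idx].max()
        if h == 0.0:                        # all neighbours share the same x
            h = 1.0
        u = dist[idx] / h
        kernel = (1.0 - u**3) ** 3          # tricube kernel on [0, 1]
        wts = kernel * weights[idx]         # assumption: user weights scale the kernel
        # Weighted least squares for a line centred at x[i];
        # the intercept is the smoothed value at x[i].
        design = np.column_stack([np.ones(k), x[idx] - x[i]])
        sw = np.sqrt(wts)
        beta, *_ = np.linalg.lstsq(design * sw[:, None], y[idx] * sw, rcond=None)
        fitted[i] = beta[0]
    return fitted

# Usage on synthetic data similar to the example above (noise levels are made up).
rng = np.random.default_rng(12345)
xs = np.sort(rng.uniform(-2 * np.pi, 2 * np.pi, size=300))
sig = rng.uniform(0.05, 0.5, size=xs.size)
ys = np.sin(xs) + sig * rng.normal(size=xs.size)
smooth = weighted_loess(xs, ys, weights=1.0 / sig**2, span=0.3)
```

Setting a point's weight to zero removes its influence entirely, which is a quick way to check that the weights are actually being used.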