Consider the following data:
probs <- seq(0, 0.3, by = 0.001)
targets <- sapply(probs, function(p) {
  sample(c(0, 1), size = 1, prob = c(1 - p, p))
})
Using loess, I can then graph the smoothed "targets" values as estimates of the probabilities:
require(magrittr)
loess(targets~probs,span=0.3) %>% predict %>% {plot(. ~ probs)}
However, I am not able to do the same using lowess, no matter which f value is chosen:
lowess(x = probs, y = targets, f = 0.01) %>% with(plot(y ~ x))
My questions: why do the results differ, and is there a way to make the lowess output match the loess one?
Numerous threads on SO suggest that, in the univariate case, loess and lowess should match.
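That claim can be checked directly. The two functions have different defaults (lowess is local-linear with robustness iterations; loess defaults to local-quadratic with none), so a sketch of aligned settings, under the assumption that these are the only differences, looks like this:

```r
# Align the defaults: local-linear fit, no robustness iterations,
# no interpolation shortcuts on either side.
set.seed(1)
x <- sort(runif(100))
y <- x + rnorm(100, sd = 0.1)

fit_lowess <- lowess(x, y, f = 2/3, iter = 0, delta = 0)
fit_loess  <- loess(y ~ x, span = 2/3, degree = 1,
                    control = loess.control(surface = "direct"))

# Should be ~0 if the settings line up as assumed.
max(abs(fit_lowess$y - predict(fit_loess)))
```

With any of these settings left at its default (degree = 2, iter = 3, delta > 0, or the interpolated surface), the outputs diverge.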
Unrelated side note: why not just use loess, then? The goal is to understand the differences between lowess and loess. Furthermore, I would like to reproduce the results with Python's statsmodels, which, to my knowledge, provides only lowess.
It's easier to generate your random sample using rbinom:
probs <- seq(0, 0.3, by = 0.001)
set.seed(1)
targets <- rbinom(301, 1, probs)
The loess smooth looks like this:
est_loess <- loess(targets ~ probs, span = 0.3) |> predict()
plot(probs, est_loess, type = "l")
If you want a similar result from lowess, try setting iter to 0:
est_lowess <- lowess(x = probs, y = targets, f = 0.2, iter = 0)
plot(est_lowess, type = "l")
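Why iter matters here (the assumed mechanism): lowess runs iter robustness iterations (3 by default) that downweight points with large residuals. With a 0/1 response, the sparse 1s look like outliers relative to a fit that hovers near zero, so they get downweighted and the curve is flattened toward 0. A quick comparison, regenerating the data as above:

```r
set.seed(1)
probs <- seq(0, 0.3, by = 0.001)
targets <- rbinom(301, 1, probs)

robust <- lowess(probs, targets, f = 0.2)            # iter = 3 (default)
plain  <- lowess(probs, targets, f = 0.2, iter = 0)  # no reweighting

# Positive if the robust fit is flattened relative to the plain one.
max(plain$y) - max(robust$y)
```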
In either case, be very careful when smoothing probabilities like this: you risk fitted values falling outside the 0-1 range. Where possible, you should convert to odds, smooth these, then convert back to probabilities. One way to achieve this is to use gam with family = "binomial":
library(mgcv)
est_gam <- gam(targets ~ s(probs, k = 100, m = 1), gamma = 0.9,
               family = binomial) |>
  predict(type = "response")
plot(probs, est_gam, type = "l", ylim = c(0, 0.3))
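The payoff of fitting on the link scale: predictions pass through the inverse logit, so they are guaranteed to lie strictly inside (0, 1), whereas nothing constrains the loess fit. A quick check on the same simulated data:

```r
library(mgcv)
set.seed(1)
probs <- seq(0, 0.3, by = 0.001)
targets <- rbinom(301, 1, probs)

est_loess <- predict(loess(targets ~ probs, span = 0.3))
est_gam <- predict(gam(targets ~ s(probs, k = 100, m = 1),
                       gamma = 0.9, family = binomial),
                   type = "response")

all(est_gam > 0 & est_gam < 1)  # TRUE: bounded by the logit link
range(est_loess)                # unconstrained; may stray outside [0, 1]
```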
Created on 2023-09-06 with reprex v2.0.2