Tags: r, plot, missing-data, locf

Plotting missing data


I'm trying to plot the following dataset after imputing it with the LOCF method, using this procedure:

> dati
# A tibble: 27 x 6
      id sex      d8   d10   d12   d14
   <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
 1     1 F      21    20    21.5  23  
 2     2 F      21    21.5  24    25.5
 3     3 NA     NA    24    NA    26  
 4     4 F      23.5  24.5  25    26.5
 5     5 F      21.5  23    22.5  23.5
 6     6 F      20    21    21    22.5
 7     7 F      21.5  22.5  23    25  
 8     8 F      23    23    23.5  24  
 9     9 F      NA    21    NA    21.5
10    10 F      16.5  19    19    19.5
# ... with 17 more rows

dati_locf <- dati %>%
  mutate(across(everything(), ~ na.locf(.x, na.rm = FALSE))) %>%               # LOCF
  mutate(across(everything(), ~ na.locf(.x, na.rm = FALSE, fromLast = TRUE)))  # NOCB for any leading NAs

apply(dati_locf[which(dati_locf$sex=="F"),1:4], 1, function(x) lines(x, col = "green"))

However, when I run the last line to plot the dataset, it returns the following warning and error messages:

Warning in xy.coords(x, y) : a NA has been produced by coercion
Error in plot.xy(xy.coords(x, y), type = type, ...) : 
  plot.new has not been called yet
Called from: plot.xy(xy.coords(x, y), type = type, ...)

Can you explain why this happens and how I can fix it? I have attached the page I was redirected to after running it.


Solution

  • If you just want to plot the LOCF imputation for a single variable, to check how plausible the imputed values look for that variable, you can use the following:

    library(imputeTS)
    # Example 1: Visualize imputation by LOCF
    imp_locf <- na_locf(tsAirgap)
    ggplot_na_imputations(tsAirgap, imp_locf)
    


    tsAirgap is an example time series that ships with the imputeTS package; replace it with the time series or variable you want to plot. Imputed values are shown in red. As you can see, last observation carried forward works reasonably well for this series, but the imputeTS package also provides algorithms that give better results here (e.g. na_kalman or na_seadec). Here is also an example of next observation carried backward (NOCB), since you used that as well.

    library(imputeTS)
    # Example 2: Visualize imputation by NOCB
    imp_nocb <- na_locf(tsAirgap, option = "nocb")
    ggplot_na_imputations(tsAirgap, imp_nocb)
    

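As for the two messages in the question: lines() can only add to a plot that already exists, which is why base R complains that plot.new has not been called yet. The coercion warning most likely comes from selecting columns 1:4, which include the character column sex, so apply() turns every row into character values. Below is a minimal base R sketch of one way around both. The small data frame is a made-up stand-in for the imputed dati_locf, and the x-axis values and axis labels are assumptions rather than anything stated in the question.

```r
# Hypothetical stand-in for the already-imputed `dati_locf` (values are made up)
dati_locf <- data.frame(
  id  = 1:4,
  sex = c("F", "F", "M", "F"),
  d8  = c(21, 21, 23.5, 21.5),
  d10 = c(20, 21.5, 24.5, 23),
  d12 = c(21.5, 24, 25, 22.5),
  d14 = c(23, 25.5, 26.5, 23.5)
)

# Assumed x positions for the four measurement columns d8, d10, d12, d14
ages <- c(8, 10, 12, 14)

# Keep only the numeric measurement columns. Including `sex` would force
# every row to character and trigger the coercion warning.
girls <- as.matrix(dati_locf[dati_locf$sex == "F", c("d8", "d10", "d12", "d14")])

# Open an empty plotting region first; lines() only adds to an existing plot
plot(range(ages), range(girls), type = "n",
     xlab = "Age (assumed)", ylab = "Measurement")

# Draw one line per subject
for (i in seq_len(nrow(girls))) {
  lines(ages, girls[i, ], col = "green")
}
```

With the real data, you would drop the made-up data frame and reuse the dati_locf produced by the imputation step. matplot(ages, t(girls), type = "l") is a base R alternative that opens the plot and draws all the lines in one call.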