How to use ggplot_na_imputations() or ggplot_na_distribution() from the package imputeTS

I have a data frame (a table with 100 rows/countries and 28 columns/months between 2020 and 2022). I used the function na_kalman() from the package imputeTS to replace my many NA values with estimated values. Everything works fine up to this point. But when I then try to plot with ggplot_na_imputations() or ggplot_na_distribution(), I get the error: "Input x_with_na is not numeric". I think the solution is to convert my data frame into a time series ('ts'). Any suggestions?

This is what I have:

total_tests_imp <- na_kalman(total_tests_md)
ggplot_na_imputations(x_with_na = total_tests_md, x_with_imputations = total_tests_imp)

(P.S.) When I run class(total_tests_md), it returns: [1] "tbl_df" "tbl" "data.frame"

When I run head(total_tests_md), it returns:

# A tibble: 6 x 29
  countries   jan_20 fev_20 mar_20 abr_20 mai_20 jun_20 jul_20 ago_20 set_20 out_20 nov_20 dez_20 jan_21 fev_21 mar_21 abr_21
  <chr>        <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
1 Afghanistan NA     NA     NA      NA     NA     NA      NA     NA     NA     NA     NA     NA      NA     NA     NA     NA 
2 Albania     NA      0.009  0.54    2.83   5.08   8.19   12.9   20.3   29.1   42.0   61.7   86.2   119.   155.   187.   214.
3 Algeria     NA     NA     NA      NA     NA     NA      NA     NA     NA     NA     NA     NA      NA     NA     NA     NA 
4 Andorra     NA     NA     NA      NA     NA     NA      NA     NA    691.  1033.  1405.  1613.   1819.  2003.  2175.  2335.
5 Angola      NA     NA     NA      NA     NA     NA      NA     NA     NA     NA     NA     NA      NA     NA     NA     NA 
6 Argentina    0.013  0.015  0.162   1.55   4.44   9.91   19.7   34.3   52.3   74.3   92.3  112.    143.   172.   204.   257.
# ... with 12 more variables: mai_21 <dbl>, jun_21 <dbl>, jul_21 <dbl>, ago_21 <dbl>, set_21 <dbl>, out_21 <dbl>,
#   nov_21 <dbl>, dez_21 <dbl>, jan_22 <dbl>, fev_22 <dbl>, mar_22 <dbl>, abr_22 <dbl>

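  • ggplot_na_imputations and ggplot_na_distribution both expect a one-dimensional input, either a numeric vector or a ts object, as the function documentation specifies. A data.frame (or tibble) holding all countries is not numeric, which is why you get the "not numeric" error.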

    So you must extract each country from your data.frame as its own numeric vector. That vector can then be converted to a time series if you want one (the example below uses zoo).
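If you would rather have a base R ts object than a zoo object, the conversion can be sketched as follows. This is only an illustration: it assumes monthly data starting in January 2020, and the names albania and albania_ts are made up for the example (the values are Albania's first four months from the sample data in this answer).

```r
# Albania's first four monthly values (the first one is missing).
albania <- c(NA, 0.009, 0.54, 2.831)

# Base R time series: 12 observations per year, starting January 2020.
# start and frequency are assumptions about the data layout.
albania_ts <- ts(albania, start = c(2020, 1), frequency = 12)

print(albania_ts)

# A ts object built from doubles is numeric, unlike a data.frame.
is.numeric(albania_ts)
```

A ts object carries its own time index, so it can be passed directly as x_with_na without building a separate vector of dates.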

    Your data

    total_tests_md <- structure(list(countries = c("Afghanistan", "Albania", "Algeria", "Andorra", "Angola", "Argentina"),
    jan_20 = c(NA, NA, NA, NA, NA, 0.013),
    fev_20 = c(NA, 0.009, NA, NA, NA, 0.015),
    mar_20 = c(NA, 0.54, NA, NA, NA, 0.162),
    abr_20 = c(NA, 2.831, NA, 0.3, NA, 1.546)),
    row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))

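    Import your libraries

    library(imputeTS)  # ggplot_na_imputations(), ggplot_na_distribution()
    library(zoo)       # zoo(), index()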

    Convert your data.frame into a vector

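    # take Albania's row (row 2) and drop the country-name column
    Albania <- total_tests_md[2,-1]
    # turn the one-row tibble into a plain numeric vector
    Albania <- as.numeric(Albania)
    # create the matching vector of month dates (one per column)
    month <- seq(as.Date("2020-01-01"), as.Date("2020-04-01"), by = "month")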

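    When you use a time series

    # working with a time series (zoo object indexed by month)
    Albaniats <- zoo(Albania, month)
    # copy the series and fill the missing first value by hand;
    # this stands in for an imputation result such as na_kalman() output
    AlbaniatsInput <- Albaniats
    AlbaniatsInput[1] <- 0.5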
    ggplot_na_imputations(x_with_na = Albaniats,
                          x_with_imputations = AlbaniatsInput,
                          x_axis_labels = index(Albaniats))

    When you use only a numeric vector

    # working with a plain numeric vector
    # again, the hand-filled value stands in for an imputation result
    AlbaniaInput <- Albania
    AlbaniaInput[1] <- 0.5
    ggplot_na_imputations(x_with_na = Albania,
                          x_with_imputations = AlbaniaInput,
                          x_axis_labels = month)
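The same idea can be wrapped in a small helper so that any country can be pulled out by name and passed to both plotting functions. This is a sketch under assumptions, not a definitive implementation: country_series() is a hypothetical helper, the data frame is a small stand-in with illustrative values, and na_interpolation() is used here instead of na_kalman() only because it is more forgiving on a very short series. With the full 28-month data you would swap na_kalman() back in.

```r
# Hypothetical helper: return one country's row as a plain numeric vector.
country_series <- function(df, country, id_col = "countries") {
  row <- df[df[[id_col]] == country, setdiff(names(df), id_col), drop = FALSE]
  stopifnot(nrow(row) == 1)  # exactly one row per country is assumed
  as.numeric(unlist(row, use.names = FALSE))
}

# Small stand-in for total_tests_md (values are illustrative only).
total_tests_md <- data.frame(
  countries = c("Albania", "Angola"),
  jan_20 = c(NA, NA), fev_20 = c(0.009, NA), mar_20 = c(0.54, NA),
  abr_20 = c(2.831, NA), mai_20 = c(NA, NA), jun_20 = c(8.19, NA)
)

# One date per month column, used as x-axis labels.
month <- seq(as.Date("2020-01-01"), by = "month",
             length.out = ncol(total_tests_md) - 1)

albania <- country_series(total_tests_md, "Albania")

# Skip countries that are almost entirely NA (e.g. Angola in this stand-in);
# there is nothing sensible to impute or plot for them.
if (requireNamespace("imputeTS", quietly = TRUE) && sum(!is.na(albania)) >= 3) {
  # Linear interpolation as a stand-in for na_kalman().
  albania_imp <- imputeTS::na_interpolation(albania)

  # Where are the missing values?
  p_dist <- imputeTS::ggplot_na_distribution(x = albania, x_axis_labels = month)

  # How do the imputed values compare with the observed ones?
  p_imp <- imputeTS::ggplot_na_imputations(x_with_na = albania,
                                           x_with_imputations = albania_imp,
                                           x_axis_labels = month)
  print(p_dist)
  print(p_imp)
}
```

To cover every country, you could loop over total_tests_md$countries and call country_series() for each one, keeping the check on the number of non-missing values so that all-NA rows are skipped.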
