Tags: r, rrdtool, rrd

Error when trying to remove NaN


I'm using the Rrd package for R to import an rrd file, and I want to delete all records that contain NaN values.

 head(rra)

                timestamp curr_proc_units entitled_cycles capped_cycles
1480982460 2016-12-05 18:01:00             NaN             NaN           NaN
1480982520 2016-12-05 18:02:00             NaN             NaN           NaN
1480982580 2016-12-05 18:03:00             NaN             NaN           NaN
1480982640 2016-12-05 18:04:00             NaN             NaN           NaN
1480982700 2016-12-05 18:05:00             NaN             NaN           NaN
1480982760 2016-12-05 18:06:00             NaN             NaN           NaN
       uncapped_cycles
1480982460             NaN
1480982520             NaN
1480982580             NaN
1480982640             NaN
1480982700             NaN
1480982760             NaN

The first rows are all NaN, but the rest are not.

#!/usr/bin/env Rscript

# libraries
library(lubridate, quietly = TRUE)
library(plyr, quietly = TRUE)
library(dplyr, quietly = TRUE)
library(chron, quietly = TRUE)
library(ggplot2, quietly = TRUE)
library(Rrd, quietly = TRUE)
library(plyrmr, quietly = TRUE)

rra = importRRD("/kathryn/rdc1vsip8/rdc1vsiphmc3/rdc1vpc1lpr56.rrm", "AVERAGE", 1480982400, 1486598400, 2)

rra$timestamp <- as.POSIXct(as.numeric(rra$timestamp), origin = "1970-01-01")

rra = rra[!is.nan(rra)];

My error is: Error in is.nan(rra) : default method not implemented for type 'list'

So how do I convert my list into something from which I can remove the NaN values?


Solution

  • Here's a reproducible version of your dataset.

    timestamps <- seq(Sys.time() - 3600, Sys.time(), by = "1 min")
    n <- length(timestamps)
    rra <- data.frame(
      timestamp = timestamps,
      curr_proc_units = runif(n),
      entitled_cycles = runif(n)
    )
    rra <- within(
      rra,
      {
        curr_proc_units[sample(n, 10)] <- NaN
        entitled_cycles[sample(n, 10)] <- NaN
      }
    )
    

    Here's a solution using dplyr's filter() function.

    library(dplyr)
    rra %>% 
      filter(
        !is.nan(curr_proc_units),
        !is.nan(entitled_cycles)
      )
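
As for the error itself: a data frame is a list of columns, and is.nan() only works on atomic vectors, which is why calling it on the whole of rra fails. If you would rather not name every column, one possible base R sketch is to apply is.nan() column by column and combine the results. The small data frame below is made up for illustration (it only borrows the column names from the question), and the names numeric_cols, has_nan, all_nan, rra_any and rra_all are hypothetical.

```r
# Stand-in data with invented values; only the column names come from the question.
rra <- data.frame(
  timestamp = as.POSIXct(c(1480982460, 1480982520, 1480982580, 1480982640),
                         origin = "1970-01-01", tz = "UTC"),
  curr_proc_units = c(NaN, 0.5, 0.7, NaN),
  entitled_cycles = c(NaN, 10, NaN, 12)
)

# Pick the numeric columns; is.numeric() is FALSE for POSIXct, so timestamp is skipped.
numeric_cols <- vapply(rra, is.numeric, logical(1))

# Apply is.nan() to each numeric column separately, then combine row-wise.
nan_by_col <- lapply(rra[numeric_cols], is.nan)
has_nan <- Reduce(`|`, nan_by_col)  # TRUE where any numeric column is NaN
all_nan <- Reduce(`&`, nan_by_col)  # TRUE where every numeric column is NaN

# Drop rows with a NaN in any numeric column.
rra_any <- rra[!has_nan, , drop = FALSE]

# Or drop only the rows where every numeric column is NaN.
rra_all <- rra[!all_nan, , drop = FALSE]
```

Which variant fits depends on whether a partially filled record is still useful to you. Since is.na() also returns TRUE for NaN, na.omit(rra) is another option if you are happy to drop rows with ordinary NA values as well.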