Subset a whole day of data if the data between two hours of that day meets a criterion?


I'm fairly new to R and would be grateful for help with this problem, as I haven't been able to find an answer online. This is part of my data frame (DF); it continues in this format until 2008:

Counter  Date        Hour  counts
1245     26/05/2006     0       1
1245     26/05/2006   100       0
1245     26/05/2006   200       2
1245     26/05/2006   300       0
1245     26/05/2006   400       5
1245     26/05/2006   500       3
1245     26/05/2006   600       9
1245     26/05/2006   700      10
1245     26/05/2006   800      15

This is my question: I need to subset my data so that, if there are any counts over 0 between the hours of 600 and 2200, the whole day (000 to 2300) is kept in the data set; if there are no counts in that period (600 to 2200), the whole day needs to be deleted. How can I do this?

I tried the following piece of code, but it keeps ONLY the rows with counts between 600 and 2200 hours, and I can't figure out how to make it keep the whole day.

DF2 <- DF[(DF$Hour >= 600) & (DF$Hour <= 2200) & (DF$counts > 0), ]  ## 16 hr worth of counts from 600 to 2200

I then aggregate the hourly counts into daily counts using the following code:

daily <- subset(DF2)
daily$Date <- as.Date(daily$Date, "%d/%m/%Y")  # dates are day/month/year, e.g. 26/05/2006
agg <- aggregate(counts ~ Date, daily, sum)    # daily totals
town <- merge(agg, DF2$Counter, all = TRUE)

Thank you so much for your help in advance, Katie


Solution

  • Try this:

    # rows within the 600-2200 window
    TDF <- subset(DF, Hour >= 600 & Hour <= 2200)
    # dates where there is at least one hour with count data in range
    dates <- subset(aggregate(counts ~ Date, TDF, sum), counts > 0)$Date
    # dates where no hour in range has a zero count
    dates2 <- subset(aggregate(counts ~ Date, TDF, prod), counts > 0)$Date

    DF2 <- subset(DF, Date %in% dates)   # whole days with any count in range
    DF3 <- subset(DF, Date %in% dates2)  # whole days with counts in every hour in range
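
Of the two results above, DF2 (days with at least one count in the window) appears to be the one the question asks for; DF3 is the stricter variant. As an alternative, here is a minimal sketch of the same "keep the whole day" idea using base R's ave(), which marks every row of a day with that day's result in one step. The toy data below is made up for illustration, the column names follow the table in the question, and grouping by Counter as well as Date is an assumption in case the full data holds more than one counter.

```r
# Toy data (invented for illustration): two days for one counter.
# Day 1 has a count inside 600-2200; day 2 only has a count at hour 300.
DF <- data.frame(
  Counter = 1245,
  Date    = rep(c("26/05/2006", "27/05/2006"), each = 24),
  Hour    = rep(seq(0, 2300, by = 100), times = 2),
  counts  = 0
)
DF$counts[DF$Date == "26/05/2006" & DF$Hour == 800] <- 15
DF$counts[DF$Date == "27/05/2006" & DF$Hour == 300] <- 4

# TRUE for rows that fall inside the 600-2200 window and have a count.
in_window_hit <- DF$Hour >= 600 & DF$Hour <= 2200 & DF$counts > 0

# ave() applies FUN within each Counter/Date group and returns a vector
# as long as the input, so every row of a day carries that day's answer.
keep_day <- ave(in_window_hit, DF$Counter, DF$Date, FUN = any)

# Keep all 24 hours of the qualifying days, drop the other days entirely.
kept <- DF[keep_day, ]

# Daily totals per counter; the dates are day/month/year, hence "%d/%m/%Y".
kept$Date <- as.Date(kept$Date, format = "%d/%m/%Y")
daily <- aggregate(counts ~ Counter + Date, data = kept, FUN = sum)
```

Here `kept` holds every hour of 26/05/2006 and nothing from 27/05/2006, because the only count on the second day falls outside the window. Aggregating by `Counter + Date` keeps the counter ID next to each daily total, so a separate merge step may not be needed.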