I have a question about interpolation. My data has two columns: time in seconds and sea level. Most of the examples I have tried use a date column (e.g. 1970-11-11), but my records are indexed in seconds and I want to linearly interpolate them to minutes. The original sampling interval is a fraction of a second (0.1 s in the example below). Any suggestions about which package is best for this? My attempt below generates a big matrix but does not reduce the number of values as expected. I want to use the result in a further analysis, with the data sampled every 1 minute instead of every 0.1 second.
set.seed(1)
time <- seq(0, 180, by = 0.1)
sl <- runif(length(time), -0.1, 4.0)
data1 <- cbind2(time, sl)
#Output needed...
time(min) sl(cm)
#Examples tried:
library(zoo) # provides zoo() and na.approx()
time<-data1$V1
SL<-data1$V2
seq1 <- zoo(order.by=((seq(min(time), max(time), by=30))))
mer1 <- merge(zoo(x=data1[1:2],order.by=time), seq1)
#Linear interpolation
dataL <- na.approx(mer1)
Here's one solution. This approach does not use linear interpolation; instead it averages the readings in a window centered on each minute, because round() assigns every reading to its nearest whole minute.
library(dplyr) # for group_by and summarize
colnames(data1) <- c("time", "sl") # makes it easier to call variables by names
data1 <- as.data.frame(data1)
data1$minute <- round(data1$time/60,0) # nearest whole minute for each reading
head(data1)
# time sl minute
# 1 0.0 0.9885855 0
# 2 0.1 1.4257080 0
# 3 0.2 2.2486988 0
# 4 0.3 3.6236519 0
# 5 0.4 0.7268959 0
# 6 0.5 3.5833977 0
data_by_minute <- data1 %>%
group_by(minute) %>%
summarize(sl_avg = mean(sl))
data_by_minute
# # A tibble: 4 x 2
# minute sl_avg
# <dbl> <dbl>
# 1 0 1.91
# 2 1 1.98
# 3 2 1.87
# 4 3 1.96
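Note that with round() the first and last groups only cover half a window (0 to 30 s and 150 to 180 s). If you would rather average over whole-minute bins that start on the minute, a minimal sketch using floor() and base R's aggregate() could look like the following. It rebuilds the simulated data so it runs on its own; the names df, minute_bin and by_bin are only illustrative.

```r
# Self-contained sketch: average sea level within whole-minute bins [m, m+1)
set.seed(1)
time <- seq(0, 180, by = 0.1)              # time in seconds
sl   <- runif(length(time), -0.1, 4.0)     # simulated sea level
df   <- data.frame(time = time, sl = sl)

# floor() sends 0-59.9 s to bin 0, 60-119.9 s to bin 1, and so on
df$minute_bin <- floor(df$time / 60)

# mean sea level per bin, using base R only
by_bin <- aggregate(sl ~ minute_bin, data = df, FUN = mean)
names(by_bin)[2] <- "sl_avg"
by_bin
```

With this binning the last bin (minute 3) contains only the single reading at 180 s, so you may want to drop it before further analysis.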
An alternative approach if you just want to take the actual readings once a minute, rather than computing the average:
data1[data1$time%%60==0,] # keeps only the observations that fall exactly on a minute and drops everything else
# time sl
# 1 0 0.9885855
# 601 60 3.2384322
# 1201 120 1.4027590
# 1801 180 0.1525986
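The exact comparison with `== 0` works for this simulated series, but time stamps built from decimal steps such as 0.1 are not always represented exactly in floating point, so on real data it may miss rows. A more defensive sketch, assuming a small tolerance is acceptable, is shown below; the name tol and its value are illustrative choices rather than anything prescribed.

```r
# Self-contained sketch: pick the readings closest to each whole minute,
# allowing for small floating-point error in the time stamps
set.seed(1)
time <- seq(0, 180, by = 0.1)              # time in seconds
sl   <- runif(length(time), -0.1, 4.0)     # simulated sea level
df   <- data.frame(time = time, sl = sl)

tol <- 1e-6                                # illustrative tolerance, in seconds

# distance of each time stamp from its nearest multiple of 60 s
dist_to_minute <- abs(df$time - round(df$time / 60) * 60)
on_minute <- dist_to_minute < tol

df[on_minute, ]
```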
Or, if you are looking for a smoothed estimate at each minute, you could fit a loess model (this is local regression smoothing rather than strict linear interpolation):
smooth_df <- data.frame(minutes = data1$time/60, sl = data1$sl) # time converted from seconds to minutes
mod_loess <- loess(sl ~ minutes, data = smooth_df) # fit a loess model: sl as a smooth function of time (response on the left of the formula)
Minute <- c(0,1,2,3) # minutes for which you want a prediction
SL_Preds <- predict(mod_loess, newdata = data.frame(minutes = Minute)) # smoothed sl at each minute
tableA <- cbind(Minute, SL_Preds)
tableA
# a 4 x 2 matrix with columns Minute and SL_Preds, one smoothed sl value per minute
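If strict linear interpolation is what you need, base R's approx() can do it without an extra package. The sketch below rebuilds the simulated data so it runs on its own; target and result are illustrative names. Because the one-minute targets here coincide with existing samples, the interpolated values equal the observed readings at those times. When a target falls between two samples, approx() returns the value on the straight line joining them.

```r
# Self-contained sketch: linear interpolation onto a one-minute grid
set.seed(1)
time <- seq(0, 180, by = 0.1)              # time in seconds
sl   <- runif(length(time), -0.1, 4.0)     # simulated sea level

# target times: every 60 s across the observed range
target <- seq(min(time), max(time), by = 60)

# approx() returns a list with components x (the targets) and y (the values)
interp <- approx(x = time, y = sl, xout = target, method = "linear")

result <- data.frame(time_min = interp$x / 60, sl = interp$y)
result
```

Keep in mind that interpolating noisy 0.1 s data straight onto a one-minute grid effectively subsamples it, so averaging or smoothing first may suit a further analysis better.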