The smallest date of `v` which makes the difference `w-v` positive


From these vectors of dates

v<-c("2019-12-06 01:32:30 UTC","2019-12-31 18:44:31 UTC","2020-01-29 22:18:25 UTC","2020-03-22 16:44:29 UTC")
v<-as.POSIXct(v)
w<-c("2019-12-07 00:11:46","2020-01-01 05:29:45","2019-12-08 02:54:10","2020-03-23 07:48:26","2020-02-02 16:58:16","2020-01-31 06:46:46")
w<-as.POSIXct(w)
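
A side note for anyone reproducing these vectors: as.POSIXct() ignores the trailing "UTC" in the strings and, without a tz argument, interprets the times in the session's local time zone. Printed differences can therefore shift by an hour when a daylight-saving change falls between two dates. A minimal sketch, assuming UTC is what is intended (the names v_utc and w_utc are only for illustration):

```r
# Assumption: the timestamps are meant to be UTC, as the suffix in v suggests.
v_utc <- as.POSIXct(c("2019-12-06 01:32:30 UTC", "2020-03-22 16:44:29 UTC"), tz = "UTC")
w_utc <- as.POSIXct("2020-03-23 07:48:26", tz = "UTC")

# The time zone is now stored on the object instead of falling back to the session default
attr(v_utc, "tzone")

# Differences in hours no longer depend on local daylight-saving rules
difftime(w_utc, v_utc, units = "hours")
```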

I would like to obtain a data frame with two columns. One column is just w. The other is built from the entries of v: each row holds the entry of v that gives the smallest positive difference w-v for that row's w, that is, the latest date in v that falls before it. For example, the differences between every entry of w and v[1] are

w-rep(v[1],length(w))
Time differences in hours
[1]   22.65444  627.95417   49.36111 2598.26556 1407.42944 1349.23778

So, if the second column of the desired data frame is w, the first column has the date 2019-12-06 01:32:30 UTC in its first row. The operation should be something like:

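# differences between the first entry of w and every entry of v
date <- w[1] - v
# entry of v that gives the smallest positive difference
v[date == min(date[date > 0])]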

Then the first row of the dataframe should be

2019-12-06 01:32:30 UTC, 2019-12-07 00:11:46

How could I build the other rows without using loops?


Solution

  • How about this:

    o <- outer(w, v, `-`)
    o
    # Time differences in hours
    #            [,1]       [,2]        [,3]        [,4]
    # [1,]   22.65444 -594.54583 -1294.11083 -2559.54528
    # [2,]  627.95417   10.75389  -688.81111 -1954.24556
    # [3,]   49.36111 -567.83917 -1267.40417 -2532.83861
    # [4,] 2597.26556 1980.06528  1280.50028    15.06583
    # [5,] 1407.42944  790.22917    90.66417 -1174.77028
    # [6,] 1349.23778  732.03750    32.47250 -1232.96194
    

    We don't want negative values, so

    o[o < 0] <- NA
    o
    # Time differences in hours
    #            [,1]       [,2]       [,3]     [,4]
    # [1,]   22.65444         NA         NA       NA
    # [2,]  627.95417   10.75389         NA       NA
    # [3,]   49.36111         NA         NA       NA
    # [4,] 2597.26556 1980.06528 1280.50028 15.06583
    # [5,] 1407.42944  790.22917   90.66417       NA
    # [6,] 1349.23778  732.03750   32.47250       NA
    

    Now just apply which.min to each row (it skips the NA entries), then subset v by the resulting indices:

    apply(o, 1, which.min)
    # [1] 1 2 1 4 3 3
    v[apply(o, 1, which.min)]
    # [1] "2019-12-06 01:32:30 EST" "2019-12-31 18:44:31 EST" "2019-12-06 01:32:30 EST" "2020-03-22 16:44:29 EDT"
    # [5] "2020-01-29 22:18:25 EST" "2020-01-29 22:18:25 EST"
    data.frame(w=w, v2=v[apply(o, 1, which.min)])
    #                     w                  v2
    # 1 2019-12-07 00:11:46 2019-12-06 01:32:30
    # 2 2020-01-01 05:29:45 2019-12-31 18:44:31
    # 3 2019-12-08 02:54:10 2019-12-06 01:32:30
    # 4 2020-03-23 07:48:26 2020-03-22 16:44:29
    # 5 2020-02-02 16:58:16 2020-01-29 22:18:25
    # 6 2020-01-31 06:46:46 2020-01-29 22:18:25
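
For long vectors, the full length(w) by length(v) matrix built by outer() may become large. One possible alternative, sketched here under the assumption that v can be sorted first, is findInterval(), which for each w returns the index of the latest v that is not later than it. The names v_sorted, idx and res are only illustrative, and tz = "UTC" is an assumption about the intended time zone.

```r
v <- as.POSIXct(c("2019-12-06 01:32:30", "2019-12-31 18:44:31",
                  "2020-01-29 22:18:25", "2020-03-22 16:44:29"), tz = "UTC")
w <- as.POSIXct(c("2019-12-07 00:11:46", "2020-01-01 05:29:45",
                  "2019-12-08 02:54:10", "2020-03-23 07:48:26",
                  "2020-02-02 16:58:16", "2020-01-31 06:46:46"), tz = "UTC")

# findInterval() requires its second argument to be sorted in increasing order
v_sorted <- sort(v)

# For each w: index of the latest v_sorted that is <= w, or 0 if every v is later
idx <- findInterval(as.numeric(w), as.numeric(v_sorted))

# Rows with no earlier v get NA instead of being silently dropped
idx[idx == 0] <- NA

res <- data.frame(w = w, v2 = v_sorted[idx])
res
```

Like the `o < 0` mask above, this keeps a v that is exactly equal to w (a difference of zero). If the difference has to be strictly positive, passing left.open = TRUE to findInterval() should exclude exact ties.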