
Julia: handling time differences in a DataFrame


I have a dataframe with a date time column. As you can see below, I've managed to read it and even transform the date.

using CSV
using Dates
using DataFrames
using DataFramesMeta
csv = CSV.read(IOBuffer("""
date,Amount
2023-12-27 18:40,4
2023-12-27 18:45,254
2023-12-27 18:50,24
2023-12-27 18:55,24
2023-12-27 19:00,5
"""),DataFrame)
transform!(csv, :date => ByRow(s -> DateTime(s, dateformat"y-m-d H:M")) => :date)

But I want to calculate the time difference between rows. I'm coming from R where I'd write:

df |> mutate(diff=date-lag(date)) # (etc)

I've no idea how to do this in Julia. I figure it's another transform, but the lag has got me bamboozled! Any help appreciated.


Solution

  • How about:

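    csv.diff = vcat([Millisecond(0)], [csv.date[i] - csv.date[i-1] for i in 2:length(csv.date)])


    This adds a new column called diff to the DataFrame csv, holding the difference between each row and the one before it. The first entry is set to Millisecond(0) because there is no previous row to compare against (you could use a different placeholder, such as missing, if you prefer). Subtracting two DateTime values yields a Millisecond period, so using Millisecond(0) rather than a plain 0 keeps the column's element type concrete instead of falling back to Any.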
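As a possible alternative to the explicit loop, here is a minimal sketch that uses Base's diff together with the Dates standard library. It works on a plain vector so that it runs without any packages. The variable names (raw, dates, steps, gaps, gap_minutes) are made up for illustration, and using missing for the first row is an assumption chosen to mirror the NA that R's lag produces.

```julia
using Dates

# Same timestamps as in the question, parsed with the same date format.
raw = ["2023-12-27 18:40", "2023-12-27 18:45", "2023-12-27 18:50",
       "2023-12-27 18:55", "2023-12-27 19:00"]
dates = [DateTime(s, dateformat"y-m-d H:M") for s in raw]

# diff returns the n-1 differences between successive elements,
# here as Millisecond periods.
steps = diff(dates)

# Prepend `missing` so the result has one entry per original row.
gaps = vcat(missing, steps)

# Optional: express each gap in whole minutes instead of milliseconds.
gap_minutes = [ismissing(g) ? missing : Dates.value(g) ÷ 60_000 for g in gaps]
```

With the DataFrame from the question, the same idea would presumably be written as csv.diff = vcat(missing, diff(csv.date)). Whether missing or a zero period is the better placeholder for the first row depends on how the column is used later.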