python sql pandas sqldf pandasql

find timestamp difference between 2 columns with sqldf


According to this answer (https://stackoverflow.com/a/25863597/12304000), we can use something like this in MySQL to calculate the time difference between two columns:

SELECT TIMESTAMPDIFF(<INTERVAL>,<timestampFrom>,<timestampTo>);

How can I achieve the same thing with pandasql? I tried these:

from pandasql import sqldf
output = sqldf("select DATEDIFF(minute,startDate,completedDate) from df")
output = sqldf("select TIMESTAMPDIFF(MINUTE,startDate,completedDate) from df")

but they throw this error:

OperationalError: no such column: MINUTE

Solution

  • From the pandasql documentation:

    pandasql uses SQLite syntax.

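    The link in your post is for MySQL. SQLite has no TIMESTAMPDIFF or DATEDIFF function, so it parses MINUTE as a column name, which is why you get "no such column: MINUTE". Here is a reference for SQLite: https://www.sqlite.org/lang.html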

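    JULIANDAY returns a date as a fractional number of days, so subtract the start from the completion time and multiply by 1440 (minutes per day) to get the difference in minutes:

    "select ROUND((JULIANDAY(completedDate) - JULIANDAY(startDate)) * 1440) from df"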