Search code examples
pandasql

Pandasql not working when I add an analytical function


I'm having an issue with the pandasql library. The library works great until I attempt an analytical function in which case I get the error:

**Error message**
OperationalError: near "(": syntax error

The table and code used is as follow:

question_id, average
ksjhf          324
9nsof          245
oi2hf          674
from pandasql import sqldf
 
q1 = """SELECT *, ROW_NUMBER() OVER ( PARTITION BY question_id ORDER BY average) AS question_number FROM ordered"""

sqldf( q1 )

Solution

  • pandasql uses a sqlite in memory database by default. sqlite only supports analytic functions as of version 3.25.0. You must upgrade sqlite. If you're in a Google Colab notebook, this should do it. Be sure to restart your runtime.

    If you're not using Colab, using a later version of Python would be the easiest way to go if that's an option.