Search code examples
ruby-on-railspostgresqlrails-activerecordcommon-table-expression

Postgres Common Table Expression query with Ruby on Rails


I'm trying to find the best way to do a Postgres query with Common Table Expressions in a Rails app, knowing that apparently ActiveRecord doesn't support CTEs.

I have a table called user_activity_transitions which contains a series of records of a user activity being started and stopped (each row refers to a change of state: e.g started or stopped).

One user_activity_id might have a lot of couples started-stopped, which are in 2 different rows. It's also possible that there is only "started" if the activity is currently going on and hasn't been stopped. The sort_key starts at 0 with the first ever state and increments by 10 for each state change.

id      to_state     sort_key     user_activity_id    created_at
1       started      0            18                  2014-11-15 16:56:00
2       stopped      10           18                  2014-11-15 16:57:00
3       started      20           18                  2014-11-15 16:58:00
4       stopped      30           18                  2014-11-15 16:59:00
5       started      40           18                  2014-11-15 17:00:00

What I want is the following output, grouping couples of started-stopped together to be able to calculate duration etc.

user_activity_id     started_created_at      stopped_created_at
18                   2014-11-15 16:56:00     2014-11-15 16:57:00
18                   2014-11-15 16:58:00     2014-11-15 16:59:00
18                   2014-11-15 17:00:00     null

The way the table is implemented makes it much harder to run that query but much more flexible for future changes (e.g new intermediary states), so that's not going to be revised.

My Postgres query (and the associated code in Rails):

query = <<-SQL
    with started as (
    select 
        id,
        sort_key,
        user_activity_id,
        created_at as started_created_at
    from
        user_activity_transitions
    where  
        sort_key % 4 = 0
    ), stopped as (
    select 
        id,
        sort_key-10 as sort_key2,
        user_activity_id,
        created_at as stopped_created_at
    from
    user_activity_transitions
    where
        sort_key % 4 = 2
    )
    select
        started.user_activity_id AS user_activity_id,
        started.started_created_at AS started_created_at,
        stopped.stopped_created_at AS stopped_created_at
    FROM
        started
    left join stopped on stopped.sort_key2 = started.sort_key
    and stopped.user_activity_id = started.user_activity_id
SQL

results = ActiveRecord::Base.connection.execute(query)

What it does is "trick" SQL into joining 2 consecutive rows based on a modulus check on the sort key.

The query works fine. But using this raw AR call annoys me, especially since what connection.execute returns is quite messy. I basically need to loop through the results and put it in the right hash.

2 questions:

  1. Is there a way to get rid of the CTE and run the same query using Rails magic?
  2. If not, is there a better way to get the results I want in a nice-looking hash?

Bear in mind that I'm quite new to Rails and not a query expert so there might be an obvious improvement...

Thanks a lot!


Solution

  • I'm trying to find the best way to do a Postgres query with Common Table Expressions in a Rails app, knowing that apparently ActiveRecord does support CTEs.

    As far as I know ActiveRecord doesn't support CTE. Arel, which is used by AR under the hood, supports them, but they're not exposed to AR's interface.

    Is there a way to get rid of the CTE and run the same query using Rails magic?

    Not really. You could write it in AR's APIs but you'd just write the same SQL split into a few method calls.

    If not, is there a better way to get the results I want in a nice-looking hash?

    I tried to run the query and I'm getting the following which seems nice enough to me. Are you getting a different result?

    [
      {"user_activity_id"=>"18", "started_created_at"=>"2014-11-15 16:56:00", "stopped_created_at"=>"2014-11-15 16:57:00"},
      {"user_activity_id"=>"18", "started_created_at"=>"2014-11-15 16:58:00", "stopped_created_at"=>"2014-11-15 16:59:00"},
      {"user_activity_id"=>"18", "started_created_at"=>"2014-11-15 17:00:00", "stopped_created_at"=>nil}
    ]
    

    I assume you have a model called UserActivityTransition you use for manipulating the data. You can use the model to get the results as well.

    results = UserActivityTransition.find_by_sql(query)
    results.size # => 3
    results.first.started_created_at # => 2014-11-15 16:56:00 UTC
    

    Note that these "virtual" attributes will not be visible when inspecting the result but they're there.