Search code examples
ruby-on-railsactiverecordruby-on-rails-3arel

Complex queries using Rails query language


I have a query used for statistical purposes. It breaks down the number of users that have logged-in a given number of times. User has_many installations and installation has a login_count.

select total_login as 'logins', count(*) as `users` 
  from (select u.user_id, sum(login_count) as total_login 
          from user u 
               inner join installation i on u.user_id = i.user_id
               group by u.user_id) g
  group by total_login;

+--------+-------+
| logins | users |
+--------+-------+
| 2      |     3 |
| 6      |     7 |
| 10     |     2 |
| 19     |     1 |
+--------+-------+

Is there some elegant ActiveRecord style find to obtain this same information? Ideally as a hash collection of logins and users: { 2=>3, 6=>7, ...

I know I can use sql directly but wanted to know how this could be solved in rails 3.


Solution

  • # Our relation variables(RelVars)
    U =Table(:user, :as => 'U')
    I =Table(:installation, :as => 'I')
    
    # perform operations on relations
    G =U.join(I)  #(implicit) will reference final joined relationship
    
    #(explicit) predicate = Arel::Predicates::Equality.new U[:user_id], I[:user_id]
    G =U.join(I).on( U[:user_id].eq(I[:user_id] ) 
    
    # Keep in mind you MUST PROJECT for this to make sense
    G.project(U[:user_id], I[:login_count].sum.as('total_login'))
    
    # Now you can group
    G=G.group(U[:user_id])
    
    #from this group you can project and group again (or group and project)
    # for the final relation
    TL=G.project(G[:total_login].as('logins') G[:id].count.as('users')).group(G[:total_login])
    

    Keep in mind this is VERY verbose because I wanted to show you the order of operations not just the "Here is the code". The code can actually be written with half the code.

    The hairy part is Count() As a rule, any attribute in the SELECT that is not used in an aggregate should appear in the GROUP BY so be careful with count()

    Why would you group by the total_login count? At the end of the day I would simply ask why don't you just do a count of the total logins of all installations since the user information is made irrelevant by the outer most count grouping.